In this Issue

The Massachusetts Institute of Technology's X Window System, Version
11, has become an industry standard window system for supporting user
interfaces in networks of workstations running under AT&T's UNIX operating 
system. In Hewlett-Packard terms, this means HP 9000 Series 300 and 800 
workstations running under the HP-UX operating system. The X Window 
System lets an application program running on one workstation display infor- 
mation to a user sitting at any workstation in the network. HP 9000 Series 
300/800 workstations also offer a high-performance 2D and 3D graphics
library called Starbase. Naturally, users wanted their application programs 
to be able to use the Starbase graphics library and run under the X Window System. Unfortunately, 
they couldn't do both at once. The two systems had been designed independently, and both 
assumed exclusive ownership of the display and input devices. Furthermore, while many X 
applications could be active in the network simultaneously, only one Starbase application could 
run on a workstation. As a result of these differences, the two systems couldn't coexist. Working 
out a solution to this problem required a joint effort of engineers at two HP Divisions, dubbed the 
Starbase/X11 Merge project. Merging the two systems was a nontrivial technical challenge. It
had to be done without sacrificing the performance of Starbase applications or requiring that they 
be rewritten. As related in the article on page 6, it required changes to the architecture of both 
systems, development of cooperating display drivers for the two systems, restructuring the interface 
between the drivers and the X server process, and development of a facility to handle communi- 
cation between the two systems. In other articles, you'll find details of the changes as they relate 
to the management of graphics resources (page 12), access to display hardware (page 20), use 
of display memory (page 33), sharing of input devices (page 38), and modification of existing test 
suites (page 42). 

The capabilities of the Starbase graphics library include high-performance color rendering and 
3D solids modeling. For determining the intensity of light reflected to the observer's eye from any 
object, the library offers three illumination models — one local and two global. A local model 
considers only the orientation of an object and light from light sources. A global model also 
considers light reflected from or transmitted through other objects in the scene. The two Starbase 
global illumination models are based on methods called ray tracing and radiosity. In the paper 
on page 78, David Burgoon presents the mathematical foundations of the radiosity method and 
compares its capabilities and limitations with those of the ray tracing method. 

The Starbase graphics library runs on HP 9000 Computers equipped with the SRX or TurboSRX 
graphics subsystems. The TurboSRX is an enhanced-performance version of the SRX design. 
On page 74, Larry Thayer explains how analysis of the data-flow pipeline of the SRX revealed
where custom VLSI chips could be used to improve the performance. He then describes three 
chips that were designed to take advantage of these opportunities for the TurboSRX version. 

For HP's commercial computer systems based on the HP 3000 Computer, the last resort in
troubleshooting usually involves analyzing a dump of the computer's memory. While powerful
tools have evolved for on-line dump analysis, until recently no parallel progress had occurred 
that would allow efficient on-line examination of operating system source code. After finding clues 
in the memory dump, HP support engineers had to rely on a complex manual process to locate
specific source code in a printed listing. Fortunately, this isn't true anymore. HP support facilities
now have HP Source Reader, a system for accessing source code stored on compact disk 
read-only memory, or CD-ROM. The source code is stored on the CD-ROM in a proprietary format 
and is retrieved by an access program that runs on an HP Vectra Personal Computer and allows 
relevant information to be popped onto the screen in seconds. On page 50, three of the system's
designers — support engineers themselves — describe HP Source Reader and present an example 
of its use. 

As integrated circuit clock rates and signal transitions have become faster and faster, it has 
become necessary to treat even very short wires and printed circuit board traces as transmission 
lines. This means that impedance matching, reflections, and propagation delays are important 
considerations. In automatic testers for such high-speed devices, transmission line techniques 
must be applied to the tester-to-device interconnection if the device is to be tested at operating 
speeds and accurate results are required. The paper on page 58 describes how this interconnection 
is implemented in the HP 82000 IC Evaluation System to ensure high-precision measurements 
even for difficult-to-test CMOS devices. A resistive divider arrangement makes it possible to test 
low-output-current devices up to their maximum operating frequencies. 

R.P. Dolan 
Editor 



Cover 

This HP 9000 Series 300 display shows the results obtainable using a Starbase/X11 Merge
system display mode called combined mode. This mode takes advantage of the sophisticated 
rendering capabilities of the TurboSRX 3D graphics accelerator, causing the two sets of display 
planes — image and overlay — to be treated as one screen. The complex 3D images were rendered 
in the image plane and the listing, the clock, the buttons, and the plot were rendered in the overlay
plane. 



What's Ahead 

The HP OSI Express card provides on one HP 9000 Series 800 I/O card the capabilities of the 
network architecture defined by the ISO Open Systems Interconnection (OSI) Reference Model. 
In the February issue, ten articles will provide insight into the OSI Express card implementation 
of the model and will define what sets this implementation apart from other networking implemen- 
tations. Also featured will be the HP 71400A Lightwave Signal Analyzer, which measures the 
characteristics of high-capacity lightwave systems and their components, including single-fre- 
quency or distributed feedback semiconductor lasers and broadband pin photodetectors. An 
accessory, the HP 11980A Fiber Optic Interferometer, helps characterize the spectral properties
of single-frequency lasers. 


System Design for Compatibility of a
High-Performance Graphics Library and 
the X Window System 

The Starbase/X11 Merge system provides an architecture
that enables Starbase applications and X Window System
applications to coexist in the same window environment. 

by Kenneth H. Bronstein, David J. Sweetser, and William R. Yoder 



HP'S HIGH-PERFORMANCE 2D and 3D GRAPHICS 
library called Starbase has proven very successful 
in engineering workstation applications. Similarly, 
the X Window System Version 11, or X11, has become
the de facto industry standard window system for supporting
user interfaces on workstations connected across a network.
Both of these systems run in the HP-UX environment
on the HP 9000 Series 300 and 800 Computer systems
(see boxes on pages 7 and 8). 

Before the Starbase/X11 Merge project, the X Window
System and Starbase graphics applications were not able
to run on the same display. An application could use either
the Starbase high-performance graphics or it could run in
the X Window System, but not both simultaneously. These
systems each make simple assumptions about ownership 
of the display and input devices, and this makes them 
unable to coexist. Since HP is one of the industry leaders 
in the X Window System technology and Starbase is a 
widely used graphics library, the Starbase/X11 Merge project
was started to design and implement a scheme whereby
X and Starbase applications could coexist on the same 
display. 

There were three major challenges associated with merging
Starbase and X11. The first challenge was to change
the architecture of the Starbase graphics libraries and the 
X Window System so that a Starbase application could run 
within an X window with full functionality and with per- 
formance comparable to Starbase running on a dedicated 
(nonwindowed) display. The second important challenge
was to enable existing Starbase applications to relink sim- 
ply with the new Starbase drivers and run in an X Window 
System with no modifications to the application's source 
code. The final major challenge was to coordinate the de- 
sign and development of this product over geographical 
and organizational boundaries. The Starbase/X11 Merge
project was the joint effort of software engineers located at
HP's Graphics Technology Division (GTD) in Ft. Collins, 
Colorado, and HP's Corvallis Information Systems Opera- 
tion (CIS) located in Corvallis, Oregon. The team in Col- 
orado was responsible for the Starbase portion of the project 
and the team in Corvallis was responsible for the X Window
System portion of the project. 

This article and the next five articles in this issue describe 
the design and implementation techniques used to handle 

Fig. 1. Incompatible architectures. (a) The architecture for
an X application. (b) The architecture for a Starbase application.
Both architectures assume complete ownership of the display.

The Starbase Graphics Package

Starbase is a library of utilities for drawing computer graphics.
It was first released in 1985, based on a draft of the ANSI and
ISO standard Computer Graphics Interface, or CGI. Since its
first release, features have been added to Starbase that go
beyond the CGI standard. The library includes functions that
draw lines, polygons, text, splines, circles, and arcs. It includes
routines that read locations or button and key presses from input



these challenges.

Design Alternatives

The architectures for a client (application) running in 
the X environment and an application using Starbase are 
shown in Fig. 1. The X Window System is network trans- 
parent, which means that an application running on one 
workstation can display itself to a user sitting at the same 
workstation or at another system across a network. Appli- 
cations, or clients, running in the X Window System are 
allowed access to the display only through the X server, 
which is a separate process that arbitrates resource conflicts 
and provides display, keyboard, and mouse services to all 
applications accessing the display. Also, as shown in Fig. 
1, many X applications can be served by the X server simultaneously.
Starbase, on the other hand, is a collection of
libraries and drivers for 2D and 3D graphics applications, 
and only one Starbase application can run on the worksta- 
tion at a time. 

In trying to merge Starbase and X,* we did not lack
alternative solutions. During the investigation stage there 
was little doubt that we could change the architectures of 
Starbase and X to coexist, but how to merge the two was 
not clear. The design alternatives included: 

■ Following the existing HP Windows/9000 model of add- 
ing window management utilities to the Starbase li- 
braries. 

■ Implementing the X server on top of the Starbase 
graphics libraries. 

■ Implementing the X server using an internal low-level 
Starbase interface. 

■ Implementing an X driver for Starbase, using X Window
System Xlib calls.

■ Writing an X extension that implements Starbase low- 
level semantics. 

■ Developing Starbase and X drivers that cooperate in ac- 
cessing the display hardware. 

The project team selected the last alternative. This ap- 
proach resulted in creating low-level drivers to support the 
rendering requirements of both Starbase applications and 
the X server, the restructuring of the server interface be- 
tween the low-level drivers and the device-independent 
portion of the X server, and the development of a facility 
to handle communication between X and Starbase. 

Low-Level Driver Redesign 

The Graphics Technology Division manufactures a variety
of display types with the following characteristics:

■ On-screen resolutions that range from 512 by 400 pixels 
to 1280 by 1024 pixels. 

■ Display planes that range from 1 (capable of displaying 
black and white) to 24 (capable of displaying any of 16
million colors, with every available pixel a different 
color). 

■ Advanced hardware features, such as 2D and 3D graphics
accelerators. Graphics accelerators provide graphics operations
such as polygon clipping, rotation, and other
transformations implemented in high-speed hardware.

To put the responsibility where the expertise lay and to

The X Window System is a trademark of the Massachusetts Institute of Technology.

*In this and other articles, X11 and X Window System will also simply be referred to as X.



devices, and routines that echo the position of an input device
on an arbitrary display.

An important goal of the Starbase product is to provide a library
of functions that can be used on a range of devices. Starbase
conceals the details of device dependencies, allowing each program
to be used with a growing list of devices without making
changes to the program. The current Starbase products support
over 20 different devices. They include workstation displays, plotters,
terminals, mice, and data tablets. New devices can be used
as they become available by linking a program with new device
drivers. This device independence is also used to assist the



accommodate all these display types, the engineers at GTD
implemented the new display drivers, and the engineers
at CIS implemented the code to translate X server semantics
into display driver formats. The interface between the display
drivers and X was called the X driver interface, or
XDI. XDI is discussed later in this article.

During the design investigation phases, we discovered 
that many requirements of the Starbase environment and 
the X server environment were similar and the basic al- 
gorithms that use the hardware were the same. This led to 
the concept of shared drivers between the X server and 
Starbase applications. Originally we hoped that the drivers 
could be shared at the object code level, that is, the drivers 



development of other graphics libraries. Implementations of libraries
for the ANSI standards Core Graphics System (CORE),
Graphics Kernel System (GKS), and Programmers Hierarchical
Interactive Graphics System (PHIGS) use the Starbase device
drivers to support the same range of devices.

The device independence of Starbase coexists with access
to the full features and maximum performance of each device
that it works with. Common features, such as line and polygon
drawing, are supported directly on capable devices and emulated
on simpler devices. The more sophisticated features of
advanced displays, such as shaded images, are available to
programmers that require these features, but not emulated on
simpler devices.

Starbase has features tuned to the needs of particular groups
of customers. Some additions optimize strictly two-dimensional
graphics, such as for printed circuit layout, electrical design,
and drafting. Functions have been added to Starbase to support
integer coordinates and transformations that allow faster, more
cost-effective display systems for these applications. Other additions
emphasize three-dimensional images such as used for
advanced mechanical design. Starbase supports perspective
views of objects with shading simulating light sources, and draws
only those parts of an image that are not hidden behind solid
objects. The most recent additions to Starbase provide photorealism,
the appearance of near reality, through ray tracing and
radiosity technologies. See the article on page 78 for more information
about radiosity.

The X Window System

The X Window System, commonly referred to as X, is an industry
standard, network transparent window system. X presents to
the user a hierarchy of resizable overlapping windows providing
device independent graphics. A graphical user interface is commonly
included as an integral part of the X Window System. The
X Window System definition is maintained by the Massachusetts
Institute of Technology X Consortium.

The first implementations of X were developed jointly at MIT
by Project Athena and the Laboratory for Computer Science.
Project Athena was faced with the problem of writing software
for hundreds of displays from different vendors on machines all
connected by a local area network. They designed X, based on
the W window system, which was the work of Paul Asente, Brian
Reid, and Chris Kent of Stanford University and Digital Equipment
Corp.

The 1986 MIT release of X, Version 10.4, was the first version
with multivendor support. HP was among the first computer manufacturers
worldwide to sell X as a product when in March 1987
the company began shipping the X Window System for HP-UX.
In January 1988 the MIT X Consortium was formed, with HP
being one of the founding members. X Consortium members
include Apple Computer Inc., Ardent Computer, American Telephone
and Telegraph Inc., Calcomp Inc., Control Data Corporation,
Digital Equipment Corporation, Data General Corporation,
Fujitsu Microelectronics Inc., Hewlett-Packard, International Business
Machines Corporation, Eastman Kodak Corporation, NCR
Corporation, Nippon Electric Corporation, Prime Computer Inc.,
Silicon Graphics, Sun Microsystems Inc., Tektronix Inc., Texas
Instruments Inc., Unisys, Wang Laboratories Inc., Xerox Corporation,
and others.

The X Window System designers, Robert Scheifler of MIT and
Jim Gettys of Digital Equipment Corporation, adopted a set of
critical design objectives, specifying that the window system
must:

■ Work on a wide variety of hardware platforms and displays

■ Facilitate implementation of device independent applications 

■ Be network transparent 

■ Allow for application concurrency 

■ Support differing application and management interfaces 

■ Provide overlapping windows and output to obscured regions 
of windows 

■ Support a hierarchy of resizable windows 

■ Provide support for text, 2D graphics, and imaging 

■ Be extensible. 



Their implementation of this design has gone through a number
of revisions. The implementation has stabilized at X Version 11,
which has been adopted as an industry standard. The current
standards bodies that have adopted some portion of X or are in
the process of adopting X include ANSI, IEEE, ISO (International
Standards Organization), NIST (National Institute of Standards
and Technology), OSF (Open Software Foundation), and
X/OPEN. MIT has facilitated the acceptance of X as a standard by
distributing the standards definition documents and the source
code of sample implementations for public use for a nominal fee.

The X Window System consists of the X server, the standard
X library, various library toolkits, and a set of X client applications.

■ The X server controls access to display hardware and input 
devices. 

■ The X library is the basic programmatic interface providing a 
standard method to manipulate windows, control input, handle 
window system events, provide text output, manipulate color 
maps, render 2D device coordinate graphics, and extend the 
client/server protocol. 

■ The X toolkits provide standard sets of widgets, menus, and
other user interface objects. The toolkits facilitate the development
of applications that have a consistent, easy to use
graphical user interface.

■ A window manager is provided as a special X application.
The functionality of the window manager has been separated
from the lower-level X server and X library. This modular design
has allowed different window managers and different user interface
models to be incorporated in any user's X environment.

The X server and the X library communicate via an asynchronous
stream-based interprocess communication protocol. This
protocol separates the application interface from the X server
implementation. The X server can then be ported to new display
devices without the need to modify the application programs.
Executable application code compatibility is maintained across
displays. This network protocol also provides the basis of network
transparency and interoperability. Network transparency means
that an application running on one computer can perform all
display and input operations for a user sitting either at the same
system or at another computer across the network. Network transparency
is provided at no cost to the application as part of the
standard X implementation. Interoperability implies that network
transparency is preserved across various computer vendors'
products.



could be compiled once and linked into both the X server 
and Starbase programs. The data structure environments 
and some of the rendering semantics of the two environ- 
ments were too different to allow this, so the less restrictive 
alternative of shared source code with conditional compi- 
lation was chosen. This scheme enabled us to avoid chang- 
ing existing Starbase library code and duplicating low-level 
display control and rendering operations for different dis- 
play types. 
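
As a rough illustration of shared source code with conditional compilation, the sketch below builds one span-filling routine for either environment. The build switch `BUILD_FOR_X_SERVER`, the `DriverPoint` type, the coordinate widths, and the byte-per-pixel frame buffer are all assumptions made for this example; they are not taken from the actual Starbase or X driver sources.

```c
#include <stdint.h>

/* Hypothetical build switch: compile with -DBUILD_FOR_X_SERVER for the
 * X server variant; otherwise the Starbase variant is built. */
#ifdef BUILD_FOR_X_SERVER
typedef struct { int16_t x, y; } DriverPoint;  /* assumed X-side layout */
#define DRIVER_NAME "xserver"
#else
typedef struct { int32_t x, y; } DriverPoint;  /* assumed Starbase-side layout */
#define DRIVER_NAME "starbase"
#endif

/* Shared rendering code: fill one horizontal span in a byte-per-pixel
 * frame buffer. The body is identical for both environments; only the
 * data structure selected above differs. */
void
fill_span(uint8_t *fb, int stride, DriverPoint start, int width, uint8_t pixel)
{
    uint8_t *p = fb + (long)start.y * stride + start.x;
    int i;

    for (i = 0; i < width; i++)
        p[i] = pixel;
}

/* Reports which variant this object file was compiled as. */
const char *
driver_name(void)
{
    return DRIVER_NAME;
}
```

The point of the sketch is that the rendering loop is written once, while each environment keeps its own data structures, which is the trade the text describes when object-level sharing proved too restrictive.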

Restructuring the X Server 

A sample implementation of the X server exists in the 



public domain and is available from the Massachusetts 
Institute of Technology (MIT). This sample implementation 
has contributed greatly to the success of the X Window 
System. The X server maintained by MIT provides X ven- 
dors with a source code template from which X server 
products can be developed. Starting with MIT's sample X 
server, vendors can develop a version of the X server that 
works on their hardware. The sample X server consists of 
three major sections: 

■ Device Independent X (DIX). High-level device indepen- 
dent code for handling cursors, events, extensions, fonts, 
and rendering requests. 

Fig. 2. The modules in the X server. The device dependent
module (DDX) shows the modifications made to accommodate
the needs of the Starbase/X11 Merge system.

■ Operating System Dependent Interface. This section contains
utilities used primarily by DIX to perform tasks
specific to the host operating system. For example, DIX
makes no assumptions about the structure of the host's
file system or about how to open communication channels;
these details are handled by the code in this section.

■ Device Dependent X (DDX). DDX contains the code that
performs device dependent I/O. For example, when a
client asks the X server to draw a circle or to display
text, DIX code interprets the request and passes it to the
appropriate procedure in DDX for proper device dependent
I/O. Conversely, when the user moves the mouse
or types on the keyboard, DDX conveys this information
to DIX for processing. DIX passes the information back
to interested clients.



To handle our needs, the DDX layer was split into two 
more layers: a translation module and the X display drivers 
(see Fig. 2). The translation module, which was written by 
the engineers in Corvallis, translates the data formats and 
requests from DIX into a form suitable for the X display 
drivers. The X display drivers, which were written by the 
engineers at GTD in Colorado, do the rendering to a particu- 
lar display. Between these two layers is the X driver inter- 
face (XDI). The X driver interface contains about four dozen 
driver entry points, the corresponding data structures, and 
a strict protocol for accessing the entry points. 

This organization of DDX provided two benefits. First, 
it enabled us to carry on development at two separate loca- 
tions and organizations, and second, it helped to eliminate 
redundant Starbase and X display driver code develop- 
ment. The functions provided by XDI include: 

■ Driver and device control 

■ Color map manipulations 

■ Accelerated graphics window support 

■ Cursor, raster, filling, vector, and text operations. 
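
One common way to structure such a driver interface in C is a table of function pointers that the translation layer calls through. The sketch below is only an illustration of that idea: the `XdiJumpTable` layout, the single entry point, its signature, and the demo driver are hypothetical, since the actual XDI entry points and data structures are not listed here.

```c
#include <stddef.h>

/* Hypothetical driver entry-point table. The real XDI has about four
 * dozen entry points; only one invented entry is shown. */
typedef struct XdiJumpTable {
    void (*FillScanline)(void *render, int nSpans,
                         const short *starts, const int *widths);
} XdiJumpTable;

/* Example "display driver": it only counts the pixels it is asked to fill. */
static long pixels_filled = 0;

static void
demo_fill_scanline(void *render, int nSpans,
                   const short *starts, const int *widths)
{
    int i;

    (void)render;   /* unused in this demo driver */
    (void)starts;   /* unused in this demo driver */
    for (i = 0; i < nSpans; i++)
        pixels_filled += widths[i];
}

static const XdiJumpTable demo_driver = { demo_fill_scanline };

/* Translation-layer routine: forwards a device independent request
 * through the table without knowing which driver is installed. */
long
translate_fill_spans(const XdiJumpTable *drv, int nSpans,
                     const short *starts, const int *widths)
{
    pixels_filled = 0;
    (*drv->FillScanline)(NULL, nSpans, starts, widths);
    return pixels_filled;
}
```

Because the caller sees only the table, a driver for a different display can be substituted by installing a different table, which is how a split between a translation module and display drivers can be developed at two sites.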

The translation module, which translates rendering re- 
quests from DIX into a format appropriate for the low-level 
X display drivers, can be very simple, as in the following
DDX-to-XDI routine, which handles the DIX request to fill
horizontal rows of pixels:



    void
    FillSpans(pDrawable, pGC, nInit, pptInit, pwidthInit, fSorted)
        DrawablePtr pDrawable;    /* pointer to drawing surface */
        GCPtr       pGC;          /* pointer to the graphics context */
        int         nInit;        /* number of spans to fill */
        DDXPointPtr pptInit;      /* pointer to list of start points */
        int         *pwidthInit;  /* pointer to list of n widths */
        int         fSorted;      /* ignored */
    {
        DECLARE_XDI_POINTERS      /* set up data pointers */
        GET_XDI_INFO              /* get information */
        PREPARE_TO_RENDER         /* set up display hardware */

        /* FillScanline is an XDI routine that accomplishes the
           fill request */
        (*(pxdiGCJumpTable->FillScanline))(pxdiRender,
            pxdiDrawable, pGC,
            nInit,                  /* number of spans to fill */
            (int16 *)(pptInit),     /* pointer to list of start points */
            (int32 *)(pwidthInit)); /* pointer to list of n widths */

        FOLLOWUP_RENDERING        /* restore state */
    }

Fig. 3. Starbase/X11 Merge software architecture.

To allow processes to acquire specialized information 
from the X server and to make specialized requests to the
X server, a small number of extensions were added to the 
X server so that Starbase applications could: 

■ Register Starbase windows with the server 

■ Retrieve the current list of rectangles that define win- 
dows visible on the screen 

■ Set up an error handler 

■ Note changes to the hardware color map. 
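
To suggest why a Starbase application would want the list of visible rectangles, the sketch below clips one horizontal span against such a list before drawing. It is a minimal illustration under stated assumptions: the `Rect` layout (exclusive right and bottom edges), the function names, and the assumption that the rectangles do not overlap are all hypothetical and do not describe the actual extension interface.

```c
/* Hypothetical visible-rectangle record; x2 and y2 are exclusive. */
typedef struct { int x1, y1, x2, y2; } Rect;

/* Clip the span that starts at (x, y) and is `width` pixels long against
 * rectangle r. Returns the visible width and stores the clipped start
 * in *outX; returns 0 when no part of the span lies inside r. */
int
clip_span(const Rect *r, int x, int y, int width, int *outX)
{
    int left, right;

    if (y < r->y1 || y >= r->y2)
        return 0;
    left  = (x > r->x1) ? x : r->x1;
    right = (x + width < r->x2) ? x + width : r->x2;
    if (right <= left)
        return 0;
    *outX = left;
    return right - left;
}

/* Count how many pixels of a span are visible, given n non-overlapping
 * rectangles that describe the visible parts of a window. */
int
visible_pixels(const Rect *clip, int n, int x, int y, int width)
{
    int i, cx, total = 0;

    for (i = 0; i < n; i++)
        total += clip_span(&clip[i], x, y, width, &cx);
    return total;
}
```

A renderer that draws directly to the frame buffer would issue one fill per clipped piece rather than just counting pixels; the counting form is used here only to keep the example checkable.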

Resource Sharing 

To facilitate the exchange of information between Star- 
base and X, and to allow multiple processes to share off- 
screen memory and other display resources efficiently, the 
graphics resource manager (GRM) was developed. The
GRM does not access the hardware directly because it is
designed to function as a notepad on which Starbase and 
X can both write information regarding their use of display 
resources. The GRM also keeps track of shared resources 
so that both X and Starbase applications can coexist on the 
same display. See the article on page 12 for more informa- 
tion on the graphics resource manager. 
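
The "notepad" idea can be pictured as pure bookkeeping: a table that records who has claimed which part of offscreen memory, with no hardware access at all. The sketch below is a deliberately simplified, hypothetical model; the structure names, the fixed table size, the integer client ids, and the bump-style allocation are assumptions for illustration and are not the GRM's real design.

```c
#define GRM_MAX_BLOCKS 8

/* One notepad entry: which client claimed which region. */
typedef struct {
    int owner;   /* hypothetical client id; 0 marks a free slot */
    int offset;  /* start of the region in offscreen memory */
    int size;    /* length of the region */
} GrmBlock;

typedef struct {
    int capacity;   /* total offscreen memory being tracked */
    int next_free;  /* simple bump allocation; regions are never reused */
    GrmBlock blocks[GRM_MAX_BLOCKS];
} Grm;

void
grm_init(Grm *g, int capacity)
{
    int i;

    g->capacity = capacity;
    g->next_free = 0;
    for (i = 0; i < GRM_MAX_BLOCKS; i++)
        g->blocks[i].owner = 0;
}

/* Record a claim for `owner`. Returns the region's offset, or -1 when
 * the request is invalid or does not fit. No hardware is touched. */
int
grm_alloc(Grm *g, int owner, int size)
{
    int i;

    if (owner <= 0 || size <= 0 || g->next_free + size > g->capacity)
        return -1;
    for (i = 0; i < GRM_MAX_BLOCKS; i++) {
        if (g->blocks[i].owner == 0) {
            g->blocks[i].owner = owner;
            g->blocks[i].offset = g->next_free;
            g->blocks[i].size = size;
            g->next_free += size;
            return g->blocks[i].offset;
        }
    }
    return -1;
}

/* Total amount recorded for one client. */
int
grm_usage(const Grm *g, int owner)
{
    int i, total = 0;

    for (i = 0; i < GRM_MAX_BLOCKS; i++)
        if (g->blocks[i].owner == owner)
            total += g->blocks[i].size;
    return total;
}
```

In this model the X server and each Starbase process would consult the same table before using a region, so that neither overwrites the other's fonts, cursors, or pixmaps.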

Starbase/X11 Architecture

Fig. 3 depicts the basic software architecture for the Starbase/X11
Merge project. The figure implies that X and Starbase
base are both accessing the display at the same time. The 
design allows for any number of Starbase applications and 
any number of X clients to coexist on the same display. 



The role of the GRM in this figure is to allocate resources 
among cooperating X server and Starbase processes. 

Fig. 4 shows the architecture of a "window-smart" 
graphics application that makes programmatic use of both 
Starbase and X from within a single program. This facility 
allows Starbase programmers to use X rendering facilities to
enhance the usability and appearance of their applications. 

Conclusion 

The Starbase/X11 Merge project occurred in an era of
increasing complexity in computer software. Software proj- 
ects are getting larger and more geographically distributed. 
This complexity is also being faced during a time when a 
new tactical model has emerged in the computer industry. 
Diverse groups (sometimes involving a company's com- 
petitors) are forming alliances to achieve a greater goal than 
any entity could achieve alone. The Massachusetts Institute 
of Technology X Consortium is a successful example of 
this new model at work. 

Starbase/X11 Merge Glossary

Because some of the terminology used here and in the rest
of the articles in this series may be new or specific to
Starbase/X11, or may be used before it is explained,
the following terms are defined.

Backing Store. Locations in offscreen or virtual memory where
the contents of a window are backed up if a window becomes
obscured because of some window system or user action.

Bit Map. A pixmap having a depth of one. On monochrome
displays the X server maintains all pixmaps as bit maps.

Clip List. A list of rectangles representing the obscured and/or
unobscured areas of a window.

Clipstamp. An integer associated with a window that is used
to determine the current validity of a list of clipping rectangles
associated with that window.

Color Map. A set of hardware registers that maintain the
red-green-blue components of individual pixels. Pixel values, which
are commonly in the range of 0 to 255, serve as indexes into the
color map.

Combined Mode. An X server operating mode on the TurboSRX
display in which the overlay and image planes appear as a
single, integrated set to the user.

Cursor. An indicator on the screen used to direct the user's
attention. The X cursor (or input pointer) traverses the whole
display, whereas Starbase cursors (commonly referred to as
echoes) move within individual Starbase windows.

DDX. Device Dependent X. The portion of the X server devoted
to handling device dependent I/O.

DHA. Direct Hardware Access. A method that allows a Starbase
application to bypass the X server and render directly to the
frame buffer.

Display Enable Register. A hardware register that controls
which planes of the display are viewable. Starbase and X use
the display enable register to implement double buffering.

DIX. Device Independent X. A section of the X server that contains
a scheduler, a resource allocator, a high-level color map, and
code for handling window functions, such as cursors, events,
extensions, fonts, graphics context, and rendering.

Drawable. A logical raster (on the screen or in memory) upon
which X and Starbase can draw. Windows and pixmaps are both
types of drawables.

Double Buffering. A graphics technique to enhance the smoothness
of motion. The technique works by using the display enable
register to toggle between two buffers. While one buffer is being
rendered into, the other is displayed. When rendering to the
hidden buffer is complete, the display enable register is changed
and the hidden buffer is displayed and the previously displayed
buffer becomes the new hidden buffer.

Frame Buffer. The video memory of a display device in which
each element represents one picture element, or pixel. The frame
buffer is divided into two parts, on-screen memory (current image
on the screen) and offscreen memory (graphics memory that is
never visible).

Graphics Context. A self-consistent set of attributes such as
foreground and background colors, line styles, and fill patterns
which are used by X clients to specify how the X server should
render the drawing requests it receives.

Gopen (Graphics Open). The Starbase action of opening a display
device or window to create a virtual device that Starbase
can render to.



GRM. Graphics Resource Manager. The GRM is a process that
handles requests from the X server and Starbase applications
for display resources such as offscreen memory and shared
memory.

Image Planes. The primary display memory on HP's display
systems used for rendering complex images.

MOMA Windows. Multiple, obscurable, movable, and accelerated
windows. Hardware logic in the graphics accelerator provides
very fast drawing and clipping of multiple windows.

Naming Conventions. The following conventions apply to procedures
mentioned in these articles:

■ X<name> is a standard X library procedure (e.g., XGetWindowInfo)

■ XHP<name> is an HP X-extension library procedure (e.g.,
XHPGetServerMode)

■ xos<name> is a procedure inside the X server, located in the
translation layer between DIX and the X display drivers

■ <name> without any prefix is typically an application-level procedure,
but must be interpreted in context.
Offscreen Memory. A portion of the frame buffer that cannot be
displayed on the monitor. In all other respects, offscreen memory
behaves the same as on-screen (visible) memory. Starbase and
X use offscreen memory to hold character, cursor, pixmap, and
scratch information for rapid transfers to on-screen memory.
Optimized Font. A character set that has been placed into offscreen
memory to increase its display output performance.
Overlay Planes. Planes of display memory that are visually on
top of or in front of the image planes. These planes are disabled
or set to a transparent color to view the image planes.
Pixel. The smallest addressable picture element of a display.
Typical HP displays have between one and two megapixels.
Pixel Value. A numeric value, typically between 0 and 255, which
determines the color of an individual pixel.
Pixmap. A hidden rectangle of raster data which is maintained
in offscreen memory when there is room, and in virtual memory
when there is no room in offscreen memory.
Raster Data. A data structure described by a two-dimensional
array of pixel values.

Raw Mode. Running a Starbase application without any window
system.

Rendering. Any form of drawing operation, including text, line,
and raster output. Rendering may occur to on-screen memory,
offscreen memory, or virtual memory.

Sample Server. The X11 server template source code made 
available to the general public by the X Consortium that enables 
X vendors to develop servers for their own products. 
Scanline. A horizontal row of pixels.

Shared Memory. A contiguous area of process data space that
is shared with another process. The X server and Starbase
applications use shared memory for communication and for sharing
fonts, color maps, and other display resources.
Socket. A communications channel between two HP-UX processes.
There are two types of sockets: internet sockets, which
are communication channels between machines across a network,
and HP-UX domain sockets, which provide faster communication
within the same machine.

Stacked Screens Mode. X server operation on overlay and
image planes in which the two sets of planes are treated as
separate display devices.
Stacking Order. An ordering imposed on a set of windows that
represents the apparent visual ordering of the windows to the
user. For a window to be at the top of the stacking order means
that it cannot be occluded by any other window.

Tile. A pixmap replicated many times to form part of a larger
pattern.

Transparent Color. A pixel value in the overlay planes that
causes the information in the image planes to be displayed
instead of the information in the overlay planes.
TurboSRX. A 3D graphics subsystem that includes a triple transform
engine, a scan converter, a 16-bit z-buffer, four overlay
planes, and up to 24 image planes. The TurboSRX also includes
the microcode to provide interactive 3D solids rendering,
photorealism, and window clipping capabilities.
Virtual Memory. Memory that the HP-UX operating system allocates
to an executing process. It is called virtual because
although the memory appears to be in physical memory to the
process, the system may swap it to and from a disk. The X display
drivers are capable of rendering graphics images to virtual
memory as well as to on-screen memory.



Visual Type. The color map capabilities of a given display.
Common visual types supported on HP displays include 1-bit static
gray (or monochrome), 8-bit pseudo color (having 256 color map
cells of RGB values), and 24-bit direct color (using 8 bits each
for red, green, and blue values).

Window. An on-screen rectangle of raster data that can be
mapped (displayed), unmapped (removed), and rendered to.
XDI. X driver interface. A set of entry points that exist in the
device dependent section of the X server, which provide an
interface between the server's translation module and the X
display drivers.

X Client. A program that interacts with the X server through one
of the X libraries using the X client/server protocol.
X Protocol. The specification from the MIT X Consortium that
precisely defines the behavior of the X server in its treatment of
clients, its handling of events and error conditions, and its
rendering operations.



Managing and Sharing Display Objects 
in the Starbase X11 Merge System 

To allow Starbase and X to share graphics resources, a 
special process called the graphics resource manager was 
created to manage access to the shared resources. An 
object-oriented approach was taken to encapsulate these 
shared graphics resources. 

by James R. Andreas, Robert C. Cline, and Courtney Loomis 



ONE OF THE CHALLENGES for the Starbase/X11
Merge project was designing an architecture that 
supports sharing of resources among X and Star- 
base applications. These HP-UX processes can realize sig- 
nificant memory savings by sharing resources such as 
character sets or fonts. X and Starbase also compete for 
private use of display resources. The architecture we developed,
called the graphics resource manager, or GRM,
supports the allocation of shared resources and at the same 
time provides use of display resources by individual pro- 
cesses. 

The GRM consists of an HP-UX process and a library. 
The GRM library is linked with the X server and Starbase 
applications and calls are made to the GRM library to com- 
municate with the GRM process. Fig. 1 shows the GRM 
architecture discussed in this article. The GRM handles a 
request it receives from the library and returns a response
to the library. The library unpacks the response and returns
the information to the caller. The GRM supports three 
modes of operation: 

■ The X server operating alone 

■ A Starbase application operating alone without the sup- 
port of any window system 

■ The X server with a Starbase application running in a 
window. 

What Is Managed? 

When we began investigating the GRM architecture, we 
assumed that we would be allocating two basic resources, 
shared memory and offscreen memory. Shared memory is 
a memory resource supported by HP-UX 12 which can be 
attached to the address space of multiple processes. Each 
process can access the shared memory space directly. By 
using shared memory in the GRM architecture, one process
can load character font information into shared memory,
and another process can later use the font.

Fig. 1. The architecture of the graphics resource manager.

Offscreen memory is a region of the display frame buffer 
that is not visible on the display screen. The frame buffer 
is the video memory of a display device dedicated to main- 
taining the value of the pixels. The X server and Starbase 
drivers use offscreen memory to optimize a variety of ren- 
dering operations. Many of HP's graphics hardware prod- 
ucts provide offscreen memory in various shapes and sizes. 
Fig. 2 shows an example of the frame buffer memory avail- 
able in the HP 98550A Color Graphics Board. The block 
mover hardware can be used to copy areas of the offscreen 
memory into visible memory. Font glyphs, which define 
the pixels to be turned on for a particular character font or 
set, are generally loaded into offscreen memory so that the 
block mover can be used to render the glyphs to a window 
at very high speeds. Pixmap patterns are also loaded into
offscreen memory so that the block mover can be used to 
paint areas of the screen using the pixmap pattern (this is 
how a window background is painted). A pixmap is an 
array of pixel values (numerical values typically between 
0 and 255) that determine the color of individual pixels. 

Offscreen memory is limited to the size provided by the
display hardware. Additional memory cannot be allocated
by the system, and so the allocation of offscreen memory
must be done carefully. Other processes can obtain offscreen
memory for the storage of unique pixmaps. The
pixmaps can subsequently be used for rendering operations,
such as a tile to fill a polygon, a background pattern
for a window, or an image used frequently in a program
(e.g., a pushbutton outline).

Object-Oriented Approach 

When it came to deciding how to implement the alloca- 
tion of shared and private resources for clients, we decided 
to use an object-oriented approach and encapsulate the 
resources in objects. 3 The first thing we did was to identify 
the items we wanted to treat as objects. We identified three 
types of graphics resource manager objects and their attri- 
butes. 

■ Shared memory objects, which are used to share fonts 
or information about some aspect of the display or system 
state. 

■ Offscreen memory objects, which are used to reserve an 
area of the offscreen memory resource. 

■ Semaphore objects, which are used to share a system
semaphore. The semaphore helps synchronize various
processes. 4

The attributes of GRM objects are divided into two groups,
general attributes and specific attributes. The general attributes
of a GRM object are a set of fields that define the
object's name. These fields are consistent among all GRM
objects. The following shows the name fields for a GRM
object.






    int    class;                       /* class of object, client defined */
    dev_t  screen;                      /* screen device */
    int    window;                      /* X window id */
    char   name[GRM_MAX_NAME_LENGTH];   /* string identifier of object */
    dev_t  device;                      /* disk device for fonts */
    int    mode;                        /* mode of a font */
    int    key;                         /* key of a font */
    int    partition;                   /* partition of offscreen memory */



The name of a GRM object is a conjunction of all the 
fields. Two objects may differ by as little as one value in 
any one of the fields. 
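
The idea that a name is the conjunction of all its fields can be sketched in C as a field-by-field comparison. This is an illustration only: the structure name grm_name, the function grm_name_equal, and the value chosen for GRM_MAX_NAME_LENGTH are assumptions, and the class field is renamed class_id here so that the sketch also compiles as C++.

```c
#include <string.h>
#include <sys/types.h>   /* dev_t */

#define GRM_MAX_NAME_LENGTH 32   /* assumed value; the article does not give one */

/* Hypothetical container for the name fields listed above. */
struct grm_name {
    int   class_id;                   /* class of object, client defined */
    dev_t screen;                     /* screen device */
    int   window;                     /* X window id */
    char  name[GRM_MAX_NAME_LENGTH];  /* string identifier of object */
    dev_t device;                     /* disk device for fonts */
    int   mode;                       /* mode of a font */
    int   key;                        /* key of a font */
    int   partition;                  /* partition of offscreen memory */
};

/* Two names denote the same object only if every field matches. */
static int grm_name_equal(const struct grm_name *a, const struct grm_name *b)
{
    return a->class_id == b->class_id &&
           a->screen == b->screen &&
           a->window == b->window &&
           strncmp(a->name, b->name, GRM_MAX_NAME_LENGTH) == 0 &&
           a->device == b->device &&
           a->mode == b->mode &&
           a->key == b->key &&
           a->partition == b->partition;
}
```

With this comparison, two otherwise identical names that differ only in key are treated as two distinct objects.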

Object-specific information is added to the instance of
an object. For example, a shared memory object includes
the specific size of the object and its specific location in
the GRM shared memory segment, and an offscreen memory
object is described by its specific width, height, and
depth, as well as its specific location in three dimensions
in offscreen memory.

Operations on Objects 

The GRM supports a set of operations that can be per- 
formed on the objects in a consistent way. 

■ GrmCreateObject. The GrmCreateObject function allocates an
object of the requested class with the object instance,
allocates the requested resource, and adds the client to
the list of clients that are using the object. If the object
already exists or cannot be created, the GRM returns an
error. If the object already exists, the client may then
share it, if it desires, by calling the GrmOpenObject function.

■ GrmOpenObject. If the described object already exists
(from calling GrmCreateObject), the client is added to the
list of clients that are sharing the object. The GRM then
passes the object's attributes back to the client. If the
object doesn't exist, the GRM returns an error.

■ GrmCloseObject. The GrmCloseObject function causes the
GRM to delete the client from the list of clients that are
sharing the object. When all clients have lost interest in
an object, the object is destroyed, and the object's
resources are freed.

Each function is an atomic operation because no other
operation is allowed to be performed while one is in progress.
As the project progressed it became necessary to
group several of these operations into one large atomic
operation. Functions were added to mark the beginning 
and end of these larger transactions. 
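
The create, open, and close semantics described above amount to reference counting. The following C sketch shows that behavior under simplifying assumptions: a single string stands in for the multi-field object name, a counter stands in for the list of clients, and all identifiers (create_object, open_object, close_object, the error codes) are hypothetical rather than the actual GRM library interface.

```c
#include <string.h>

#define MAX_OBJECTS 16
#define MAX_NAME    32

enum { GRM_OK = 0, GRM_ERR_EXISTS = -1, GRM_ERR_NOT_FOUND = -2, GRM_ERR_FULL = -3 };

struct object {
    char name[MAX_NAME];   /* stands in for the full multi-field name */
    int  users;            /* number of clients sharing the object; 0 = slot free */
};

static struct object table[MAX_OBJECTS];

static struct object *find(const char *name)
{
    for (int i = 0; i < MAX_OBJECTS; i++)
        if (table[i].users > 0 && strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Create: fails if the object already exists or no slot is left. */
static int create_object(const char *name)
{
    if (find(name))
        return GRM_ERR_EXISTS;
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (table[i].users == 0) {
            strncpy(table[i].name, name, MAX_NAME - 1);
            table[i].name[MAX_NAME - 1] = '\0';
            table[i].users = 1;       /* the creator is the first client */
            return GRM_OK;
        }
    }
    return GRM_ERR_FULL;
}

/* Open: share an object that some client has already created. */
static int open_object(const char *name)
{
    struct object *o = find(name);
    if (!o)
        return GRM_ERR_NOT_FOUND;
    o->users++;
    return GRM_OK;
}

/* Close: drop one client; the slot is freed when the last client leaves. */
static int close_object(const char *name)
{
    struct object *o = find(name);
    if (!o)
        return GRM_ERR_NOT_FOUND;
    o->users--;
    return GRM_OK;
}
```

A client that receives GRM_ERR_EXISTS from create_object would fall back to open_object, and the object disappears only after every sharer has called close_object.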

The GRM also supports a function to find and list the 
objects it has created. To query the existence of sets of 
objects, the client can supply an object name with the fields 
set to match the value fields in other objects. This function 
is primarily used for debugging purposes. 

Design and Implementation 

The project teams investigated three main architectures 
to determine the best design: 

■ Build the GRM into the X server. One of the first architec- 
tures we examined was building the GRM functionality 
into the X server. In this architecture, the Starbase pro- 
grams would communicate with the X server to allocate 
resources. Fig. 3 shows this architecture. We did not 
choose this architecture for several reasons. One reason 
is that the X server is used primarily as a rendering 
engine. The X server could be busy for many seconds 
performing a rendering request, causing the Starbase 
client to block until the X server could process a request. 



Also, GRM functionality would become dependent on a 
particular software technology in the X server, which 
may change as enhancements are made to X. Another 
problem occurs when a Starbase application is running 
alone in raw mode. The X server would have to be exe- 
cuted to support the Starbase client, even though the X 
Window System operation was not desired. 

■ Construct the GRM as a library. The second architecture 
the team examined implemented the GRM as a library, 
which could be linked into the X server and Starbase 
clients (see Fig. 4). Resource allocation would be per- 
formed by the library with multiprocess communication 
done through a single shared memory segment. With 
this scheme, allocation of objects could be done very 
quickly. The allocation operation would consist of di- 
rectly manipulating data structures in the shared mem- 
ory segment. This model was not chosen because of con- 
cerns about its ability to support future upgrades, and 
because it relies on consistent operation among all im- 
plementations that manipulate the shared memory infor- 
mation. We felt that we could achieve more robustness 
by choosing a protocol-based communications model. 
To support future version changes in this model, the 
data structures would have to be designed with built-in 
flexibility and version information. Proving that a newer 
version of the GRM library would work properly with 
older versions and vice versa would have been very dif- 
ficult. 

■ Form the GRM as an independent process. After considering
the previous two models, the project team settled
on implementing the GRM as an independent process.
The independent process model is shown in Fig. 5. The 
independent process model provides logical isolation 
between the GRM and its client processes (the X server 
and Starbase processes). The GRM process is free to define 
its data structures for allocating objects without worrying 
about access to these structures by the client processes. 
This architecture also enabled the designers of the GRM 
to be flexible in the algorithms used to allocate the objects 
without worrying about backwards compatibility with 
previous versions of the GRM. The protocol between the 
GRM and its clients is also typed with a version number, 
and the protocol data structures are padded to maximize 
the potential upgradability of the GRM services. 
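
A versioned, padded protocol of this kind might look like the following C sketch. The layout is purely illustrative: the header fields, the number of reserved words, and the names request_header, init_header, and header_acceptable are assumptions, not the actual GRM protocol.

```c
#include <stdint.h>
#include <string.h>

#define PROTO_VERSION 1u       /* hypothetical version number */

/* Hypothetical fixed-size request header. The reserved words are the
 * padding: a later version can give them meaning without changing the size. */
struct request_header {
    uint32_t version;      /* protocol version spoken by the sender */
    uint32_t opcode;       /* for example create, open, or close */
    uint32_t length;       /* bytes of payload following the header */
    uint32_t reserved[5];  /* padding, always sent as zero */
};

/* Fill in a header, clearing the padding so old receivers see zeros. */
static void init_header(struct request_header *h, uint32_t opcode, uint32_t length)
{
    memset(h, 0, sizeof *h);
    h->version = PROTO_VERSION;
    h->opcode = opcode;
    h->length = length;
}

/* Receiver-side check: reject versions this build does not understand. */
static int header_acceptable(const struct request_header *h)
{
    return h->version >= 1u && h->version <= PROTO_VERSION;
}
```

Because the message size stays fixed and the version travels with every request, the process on the other end can change its internal data structures freely without breaking older clients.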

Interaction with X and Starbase 

Communication with the GRM is originated by either
the X server or the Starbase display driver. The GRM process
works by receiving a request, processing the request,
and then returning a reply message to the requester. An
application can perform both X library calls and Starbase
library calls. This results in activity by both the X server
and the Starbase driver. To get their work done, these GRM
clients can call functions in the GRM library to create or 
open objects. The operation is synchronous because the 
client is blocked until the operation is completed by the 
GRM. The GRM library packages the client request and 
sends the request to the GRM process. The GRM process 
processes the request, and if it is asked to create an object, 
allocates the resources for the object. The client is then 
added to the list of clients referencing the object. Finally,
the GRM process returns a reply, which is received by the 
GRM library. The GRM library unpacks the reply and re- 
turns information describing the object to the caller. 

The GRM process never directly modifies data in the 
GRM shared memory segment or in the display hardware. 
The GRM process instead acts upon an "abstract view" of 
these resources. The GRM maintains a data structure representing
the available resources in the GRM shared memory
and the display hardware offscreen memory (see these
data structures in Fig. 1). When the GRM process allocates 
an object, it updates the associated data structure. 

Allocation of Offscreen Memory 

Currently, all HP display devices supported by the HP 
9000 Series 300 and Series 800 product lines provide an 
offscreen memory resource. This memory is configured on 
the device as an extension to the memory used to hold 
viewable information on the display. Since display mem- 
ory has a width, a height, and a depth, the offscreen memory 
also has these dimensions. This complicates the sharing 
of this memory because the GRM memory manager must 
allocate three-dimensional objects. Offscreen memory is 
relatively easy to manage if only one process wishes to 
display data on the screen at a time. However, in the
Starbase/X11 Merge architecture, multiple applications share
the display device, so managing the sharing of the offscreen 
memory resource is quite a challenge. On some HP display 
devices, pixmaps of varying depths can be allocated. Also, 
some display devices require that the pixmaps be aligned
on pixel boundaries for efficient access. The challenge is
to be able to allocate an arbitrarily sized and aligned
three-dimensional box out of an arbitrarily sized three-dimensional
box of free space. In addition, the algorithm must
efficiently deal with the resulting free space for future
allocations.

Three-dimensional objects are typically perceived as
spheres and polyhedra of various shapes and sizes. Pixmaps,
however, are three-dimensional objects that take the form of
six-sided blocks. A pixmap generally has a uniform width, a
uniform height, and a uniform depth. The GRM algorithm
addresses just such pixmaps.

The Two-Dimensional Case 

Three-dimensional allocation is best explained as an extension
of the two-dimensional case. The following discussion
of the two-dimensional case will show that the addition
of a third dimension is a fairly simple extension of
the two-dimensional philosophy.

We start with the two boxes shown in Fig. 6. Box A 
represents the available memory resource and box B is the 
space to be allocated out of box A. If box B is placed inside
box A, the rest of A can be divided into any of the configurations
shown in Fig. 7.

Configuration 1 produces a lot of fragmentation of the 
free space. This fragmentation alone is enough to discount 
it as a viable option. This leaves configurations 2 and 3. 
There is only one difference between these two configurations,
and that concerns how memory is globally allocated.
With configuration 2, free space is cut into vertical strips,
which results in memory being allocated in vertical strips,
and in configuration 3, free space is cut into horizontal
strips, which results in memory being allocated in horizontal
strips. In general, it makes little difference which configuration
is chosen. For software used on Hewlett-Packard
workstations, there is a reason to use horizontal strips.
Fonts are stored as horizontal strings of characters. Since
caching fonts is a major use of offscreen memory, configuration
3 was chosen as the optimal solution.
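
The horizontal-strip split can be sketched in C as follows. This is one plausible reading of configuration 3, assuming box B may sit anywhere inside box A: the free space above and below B is kept as full-width strips, and the pieces to the left and right of B are only as tall as B. The names rect, split_horizontal, and area are hypothetical.

```c
struct rect { int x, y, w, h; };   /* origin at top-left, y grows downward */

/*
 * Split free rectangle a around allocated rectangle b (b must lie inside a)
 * into horizontal strips:
 *   out[0] = full-width strip above b
 *   out[1] = full-width strip below b
 *   out[2] = piece to the left of b, same height as b
 *   out[3] = piece to the right of b, same height as b
 * Any piece may come out with zero width or height.
 */
static void split_horizontal(struct rect a, struct rect b, struct rect out[4])
{
    out[0] = (struct rect){ a.x, a.y, a.w, b.y - a.y };
    out[1] = (struct rect){ a.x, b.y + b.h, a.w, (a.y + a.h) - (b.y + b.h) };
    out[2] = (struct rect){ a.x, b.y, b.x - a.x, b.h };
    out[3] = (struct rect){ b.x + b.w, b.y, (a.x + a.w) - (b.x + b.w), b.h };
}

static int area(struct rect r) { return r.w * r.h; }
```

The four free pieces plus the allocated box always add up to the area of the original box, and the widest free pieces run horizontally, which suits fonts stored as horizontal strings of characters.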

Fig. 6. Two-dimensional allocation of offscreen memory. Box
A is the available memory and box B is the space to be
allocated from box A.

Fig. 7. Different memory allocation configurations possible
when space for a box is allocated from a larger box.

Adding a Third Dimension

Adding a third dimension to this problem means taking
the two-dimensional view and adding the concept of a
front and a back to the object being allocated (see Fig. 8a).
As with the two-dimensional model, there are a few ways 
to handle breaking off front and back pieces to make effi- 
cient use of the resulting space. Each method results in six 
free blocks and one allocated block out of the original block 
of memory. To coalesce the blocks when an allocated block 
is freed, the GRM associates the free blocks resulting from
an allocation with the allocated block (see Fig. 8b). With 
this scheme, when the originally allocated block is freed, 
the blocks that can coalesce with it are easily found. 

The allocated block forms a node of a tree, with the leaves 
of the tree initially being free blocks. New requests for 
offscreen memory cause one of the surrounding blocks to 
be allocated, with the result being that the new allocated 
block becomes a node with a new set of leaf blocks showing 
the free areas; that is, the tree grows (see Fig. 9). As blocks
are freed the tree shrinks as leaves are coalesced with parent
nodes. For efficient access, the GRM maintains a list of free
blocks. This list optimizes the search for the best-sized free 
block to satisfy an allocation request. 
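
A search of the free list for the best-sized block might be sketched in C as below. The article does not say how "best-sized" is measured; this sketch assumes it means the smallest-volume free block that is large enough in all three dimensions. The names block, volume, and best_fit are hypothetical.

```c
#include <stddef.h>

struct block {
    int w, h, d;            /* width, height, depth */
    struct block *next;     /* next entry on the free list */
};

static long volume(const struct block *b)
{
    return (long)b->w * b->h * b->d;
}

/* Return the smallest free block that can hold a w x h x d request,
 * or NULL if no block on the list is large enough. */
static struct block *best_fit(struct block *free_list, int w, int h, int d)
{
    struct block *best = NULL;
    for (struct block *b = free_list; b != NULL; b = b->next) {
        if (b->w < w || b->h < h || b->d < d)
            continue;                       /* too small in some dimension */
        if (best == NULL || volume(b) < volume(best))
            best = b;
    }
    return best;
}
```

Choosing the tightest fit leaves the larger free blocks intact for later requests, at the cost of a linear scan of the list.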

The GRM Daemon 

The purpose of the GRM process, or daemon, is to manage 
the allocation of graphics display hardware resources for
all processes that want to use these resources. As such, it
maintains a comprehensive list of the resources that have
been allocated to these processes. The GRM daemon can
only perform this task correctly if it can be certain that 
there is only one GRM daemon process that is allocating 
resources to all applications requesting them. 

Typical daemon processes are started by an initialization 
script at system boot time. In this situation, uniqueness of 
a daemon process can be easily assured by avoiding mul- 
tiple invocations of the script that starts the daemon pro- 
cess. However, the situation for the GRM daemon is differ- 
ent because the GRM daemon is not started at boot time. 

Since the GRM daemon has a specialized purpose, it is 
preferable to have it executing only on an as-needed basis,
rather than running continuously as would be the case if
it were started at boot time. The GRM daemon is therefore
designed to be spawned only by a process that requires
access to the display hardware. Of course, it is only neces- 
sary to spawn a GRM daemon process if one has not already 
been put into service by another graphics application. 

The design of the Starbase/X11 system dictates that the
X server and all Starbase applications absolutely depend
on the proper functioning of the GRM daemon. As such,
the design of the GRM daemon required a foolproof method
to ensure that for a particular host system, exactly one GRM
daemon is given the task of mediating the use of all display
hardware associated with that host, even when two or more
applications attempt to spawn a GRM daemon simultaneously.

Fig. 8. (a) Two-dimensional views of an allocated block with
front and back added. (b) Data structure representation of
an allocated block with the six free blocks from the original
block of memory.

Fig. 9. Allocation of a new block. The new allocated block
becomes a node with six leaves, which represent free blocks.

The HP-UX Semaphore System 

A simple solution to the problem of guaranteeing unique- 
ness for the GRM daemon process is to use a semaphore 
to ensure that only a single daemon has permission to 
continue as the resource manager. Potentially, several GRM 
daemon processes could be starting simultaneously, each 
trying to test and set the GRM daemon semaphore. The
GRM semaphore mechanism ensures that only one of those
processes actually succeeds in the test and set operation,
with the remaining processes being obligated to recognize
that another GRM daemon process is principal and to exit.* 

Using a system semaphore to implement this scheme 
would have been trivial had it not been for a limitation in 
the behavior of the HP-UX system semaphore during the 
creation of a semaphore.** This limitation is that the value
of a semaphore after its creation is not defined. 

While the operating system does provide an atomic op- 
eration for creating a system semaphore exclusively (the 
operation succeeds only if the semaphore does not already 
exist), it does not guarantee the state of the newly created
semaphore to be any particular value. Therefore, a process 
can know that it has created a previously nonexistent sys- 
tem semaphore and that it must initialize the value of the 
semaphore, but a separate process cannot know that a given 
semaphore has just been created and is not yet initialized. 
Since the creation of a semaphore and the initialization of 
its value is a two-step process, it is conceivable that another 
process might attempt a semaphore operation between the 
creation and initialization steps. For an application such 
as the GRM daemon, this limitation presented a severe 
problem that required a substantial workaround. 

*In the context of this article a semaphore can have a value of zero or one. A semaphore
is initialized to a value of zero. A test and set operation on a semaphore succeeds only
when the value of the semaphore is initially zero. A successful test and set operation results
in the semaphore's having a value of one. The process of testing and setting the value of
the semaphore is said to be an atomic operation, meaning that the operation is indivisible.

The problem with the system semaphore can be clarified
with an example (see Fig. 10). Consider the situation where
two GRM daemon processes (process A and process B)
have been started and they are both attempting to create
and then test and set the GRM semaphore. Suppose that
process A successfully creates the GRM semaphore. Before
process A has had a chance to initialize the value of the
semaphore it is preempted by the kernel's scheduler. Process
B then comes along, notices that the GRM semaphore
already exists, and attempts a test and set operation on the
semaphore which currently has an undefined value. The
test and set operation may or may not succeed depending
on the random value of the semaphore. However, if it does
succeed, process B will think that it has been designated
as the principal GRM daemon and carry on as such. Meanwhile,
process A has regained the processor and proceeds
to initialize the value of the GRM semaphore, overwriting
the effect of the test and set operation of process B.
Subsequently, process A will successfully execute a test and
set operation on the GRM semaphore resulting in two GRM
daemon processes running when there should only be one.
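
The interleaving in this example can be replayed deterministically in a few lines of C. The sketch below models the semaphore as a plain structure and makes no real HP-UX calls; the names sem, sem_create_excl, sem_init_value, sem_test_and_set, and replay_race are all hypothetical, and the "garbage" argument stands for the undefined value a semaphore has just after creation.

```c
/* Model of a semaphore whose value is undefined right after creation. */
struct sem { int exists; int value; };

/* Exclusive create: succeeds only if the semaphore does not exist yet.
 * The value after creation is whatever garbage is passed in. */
static int sem_create_excl(struct sem *s, int garbage)
{
    if (s->exists)
        return 0;
    s->exists = 1;
    s->value = garbage;
    return 1;
}

static void sem_init_value(struct sem *s) { s->value = 0; }

/* Test and set: succeeds only if the value is zero, then sets it to one. */
static int sem_test_and_set(struct sem *s)
{
    if (s->value != 0)
        return 0;
    s->value = 1;
    return 1;
}

/* Replays the A/B interleaving and returns how many processes end up
 * believing they are the principal daemon. */
static int replay_race(int garbage)
{
    struct sem s = {0, 0};
    int principals = 0;

    int a_created = sem_create_excl(&s, garbage);  /* A creates, then is preempted */
    int b_created = sem_create_excl(&s, garbage);  /* B finds it already exists */
    if (!b_created)
        principals += sem_test_and_set(&s);        /* B tests the uninitialized value */
    if (a_created) {
        sem_init_value(&s);                        /* A resumes and initializes */
        principals += sem_test_and_set(&s);        /* A's test and set succeeds */
    }
    return principals;
}
```

When the garbage value happens to be zero, replay_race reports two principals, which is exactly the failure described; with a nonzero garbage value only process A wins, which is why the bug would appear only intermittently.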

Various workarounds to the semaphore initialization 
problem were attempted, but none of them that exclusively 
used system semaphores would work because it could not 
be determined whether or not the value of a semaphore 
was valid. A colleague who had experienced similar prob- 
lems with system semaphores suggested that a file lock*** 
could be used as the GRM semaphore. Besides being used 
to control access to a file, file locks can be used in an 
advisory capacity in much the same way as a system 
semaphore. File locks have the advantage that the test and 
set operation does not require the two-step (not atomic) 
"create and initialize" procedure used by system 
semaphores. However, file locks can be difficult to manage 
when the file being used as the subject of the lock, that is,
the lock file, is not writable or is transitory. As such, the 
GRM daemon uses a file lock only as a means to control 
access to the system semaphore, and the semaphore is re- 
sponsible for awarding a single GRM daemon process the 
guarantee of uniqueness. 
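
An advisory file lock used as a gate can be sketched with the POSIX fcntl record-locking interface. The article does not say which locking call the GRM daemon uses, so fcntl is an assumption here, as are the function names lock_file_acquire and lock_file_release. The caller would hold the lock only while creating, initializing, and testing the system semaphore.

```c
#include <fcntl.h>
#include <unistd.h>

/* Try to take an exclusive advisory lock on the whole lock file.
 * Returns an open descriptor holding the lock, or -1 on failure
 * (for example, when another process already holds the lock). */
static int lock_file_acquire(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;      /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "to end of file" */

    if (fcntl(fd, F_SETLK, &fl) == -1) {   /* non-blocking attempt */
        close(fd);
        return -1;
    }
    return fd;
}

/* Release the lock and close the descriptor. */
static int lock_file_release(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    int rc = fcntl(fd, F_SETLK, &fl);
    close(fd);
    return rc == -1 ? -1 : 0;
}
```

Unlike the create-then-initialize sequence of a system semaphore, the lock attempt is a single call, and with fcntl locks the kernel drops the lock when the holding process terminates, so a killed process cannot leave the lock held even if the lock file itself remains.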

The GRM Daemon Semaphore System 

As mentioned earlier, the purpose of the GRM daemon
semaphore system is to ensure that exactly one GRM
daemon process successfully claims responsibility for managing
the allocation of the display hardware. The system
must be reliable in the face of an arbitrary number of competing
infant GRM daemon processes (processes that have
not yet established their principal status). The system must
not require user intervention even in the face of an ungraceful
exit by a GRM daemon process.

**The HP-UX system semaphore conforms to the AT&T UNIX System V Definition.

***A file lock is a file system semaphore associated with a segment of a particular file,
which is referred to here as the lock file.

(1) For this example, the GRM semaphore has an incidental value of zero following
its creation. In general, an HP-UX semaphore has a random initial value.

(2) Since the GRM semaphore had an initial value of zero, the test and set operation
succeeds.

(3) The effect of the test and set operation of GRM daemon process B at time T3 is
nullified by the initialize operation.

(4) Each of two GRM daemon processes has successfully tested and set the GRM
semaphore. The semaphore thus fails to allow only one process to continue
as the principal GRM daemon.

Fig. 10. Timing diagram of a semaphore failure at initialization
of two processes.

Given these considerations the GRM semaphore system 
was designed to accommodate the following situations: 

■ Any process killed without opportunity for a graceful
exit. This means that the design must be able to recover
when the GRM system semaphore and/or lock file are
left around after the process that created them is terminated
by other than programmatic means.

■ An arbitrary number of infant GRM daemon processes
attempting to claim principal (unique) status simultaneously.

■ An existing GRM daemon process holding the GRM
semaphore while in the process of exiting.

A GRM daemon process has three phases. Its first phase 
is during the initialization of an application requiring the 
services of a GRM daemon. The second phase is during its 
attempt to set the GRM semaphore and claim principal
status. The third phase is the operational phase, when it 
is assured uniqueness and carries out the tasks required of 
the display hardware resource manager. 
The Application Phase. Starbase applications and the X
server must have an executing GRM daemon to function.
During initialization, a GRM library routine within these
programs attempts to make a connection with the GRM
daemon through its designated socket address. 5 If it fails
to make the connection, the routine assumes that there is
no GRM daemon process executing and it spawns a GRM
daemon process. The spawned process "daemonizes" itself
(detaches from any terminal or parent process), sets its user
identification number, and then attempts to establish itself
as the only GRM daemon process.

Claiming Principal Status. Immediately after an infant
GRM daemon process is daemonized, it proceeds in its
attempt to become the only GRM daemon process. Fig. 11
shows the timing diagram for two processes (processes B
and C) trying to claim principal status and control of the
GRM semaphore.

The first step is to test and set the file lock, thereby 
claiming exclusive access to the GRM semaphore. In this 
way, if the semaphore does not already exist then the GRM
daemon can create the semaphore and initialize its value 
without fear that another GRM daemon process may be 
trying to access the semaphore at the same time. The lock 
file, which is used as the subject for the file lock, must be 
created if it does not already exist. If the file already exists, 
either another process is trying to access the GRM
semaphore or a process was killed while attempting such 
access. 

The next step is to see if the GRM semaphore exists. If
the semaphore already exists, then it is known to have a
valid value. This is true since any GRM daemon process
that created the semaphore is by convention guaranteed to
have initialized its value before releasing the file lock. If
the GRM semaphore does not exist, then it is created and
initialized with a valid value. Since the process is holding
the file lock, it need not worry about another process attempting
to test and set or initialize the value of the GRM
(1) GRM daemon process A holds the GRM semaphore.

(2) GRM daemon process B holds the GRM file lock.

(3) GRM daemon process C holds the GRM file lock.

(4) The test and set semaphore operation includes the creation of the semaphore if it 
doesn't already exist. The creation, testing, and setting of a semaphore can 
be considered to be an atomic operation since all of these operations are 
executed while holding the file lock and only one process can be holding the file 
lock at a given time. 

(5) GRM daemon process C holds the GRM semaphore. 

(6) A retry cycle includes setting the file lock, testing the semaphore, releasing the 
file lock, and a short sleep. 

(7) Concede and exit (i.e., time out) after enough time has elapsed during the retry 
cycle to allow an existing GRM daemon process to service a disconnect request 
from its last client, free various resources, remove its listening socket, and 
remove the GRM semaphore. 

Fig. 11. Timing diagram for the GRM semaphore system. 



18 HEWLETT-PACKARD JOURNAL DECEMBER 1989 



© Copr. 1949-1998 Hewlett-Packard Co. 



semaphore. 

Once the existence of the GRM semaphore is established 
and it is known to have a valid value, an attempt can be 
made to test and set the semaphore. If successful, the 
semaphore is then held by the process that set it. Once the 
semaphore is set, the lock file can be removed, allowing
other processes to create a new lock file in order to access
the semaphore. The life of the lock file is generally limited
to the duration of the creation and initialization of the GRM
semaphore. 

If the test and set or any of the preceding operations is 
not successful, then the file lock must be released to provide 
other infant GRM daemon processes the opportunity to
access the GRM semaphore. After the file lock has been 
released, the infant GRM daemon process will sleep for a 
short period of time and then retry the entire procedure. 
The sleep duration is short enough to expedite the GRM 
daemon startup procedure. However, the retry loop results 
in a delay that is long enough to ensure that there is enough 
time for an exiting GRM daemon to finish its exit and clear 
the GRM semaphore.
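
The retry cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GRM implementation: the file lock is taken with flock() on a lock file, the GRM semaphore is stood in for by a small file holding "0" (free) or "1" (held), and the names try_claim, claim_principal, and release_principal are hypothetical. Unlike the scheme in the article, the sketch leaves the lock file in place rather than removing it.

```python
import fcntl
import os
import tempfile
import time


def try_claim(lock_path, sem_path):
    """One retry cycle: set the file lock, test and set the semaphore, release the lock."""
    # Create the lock file if it does not already exist.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # Test and set the file lock without blocking.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False  # another process holds the file lock
        # Holding the file lock: safe to create and initialize the semaphore.
        if not os.path.exists(sem_path):
            with open(sem_path, "w") as f:
                f.write("0")
        # Test and set the (stand-in) semaphore.
        with open(sem_path, "r+") as f:
            if f.read(1) == "0":
                f.seek(0)
                f.write("1")
                return True
        return False
    finally:
        os.close(fd)  # closing the descriptor releases the file lock


def claim_principal(lock_path, sem_path, retries=5, nap=0.01):
    """Retry the whole procedure, sleeping briefly between attempts."""
    for _ in range(retries):
        if try_claim(lock_path, sem_path):
            return True
        time.sleep(nap)
    return False  # concede and exit


def release_principal(sem_path):
    """An exiting principal daemon removes the semaphore."""
    os.remove(sem_path)
```

In this sketch the retry budget (retries times nap) plays the role of the timeout in Fig. 11: it should be long enough for an exiting daemon to finish removing the semaphore.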

Operations Phase. Once established as the principal GRM 
daemon process, the GRM daemon goes about initializing 
its data structures and opening its listening socket to begin
serving its purpose. One or more GRM clients will then 
make connections to the GRM daemon and request display 
hardware resources as needed. When a GRM client exits, 
its connection with the GRM daemon is closed and the 
resources allocated to it are freed and made available to 
other GRM clients. 

When the GRM daemon detects the absence of its clients, 
it removes the listening socket, removes the semaphore, 
and then exits. Any GRM client that may be starting up at 
this time will fail to establish a connection, which includes 
verifying the connection with a full handshake, and it will 
start the process over again by spawning a new GRM 
daemon. 



Conclusion 

The GRM provides a means for allocating a system's 
display resources among various competing clients. The 
GRM also provides a means of sharing information among 
the clients through the encapsulation of the information 
in objects. One client can access an existing object if it 
knows the name of the object, even if the object was created
by another client. The client can access the data by asking 
the GRM to open the object. The GRM also provides a 
sophisticated memory allocation mechanism for the scarce 
offscreen memory resource. The mechanism includes a 
means to coalesce freed fragments of offscreen memory for
reuse. Finally, the design of the GRM interface ensures that 
only one GRM daemon process runs on a given system, 
even though several clients initiate access to the GRM 
simultaneously. 
Sharing Access to Display Resources in
the Starbase/X11 Merge System

The Starbase/X11 Merge system provides features to allow
Starbase applications direct access to the display hardware 
at the same time X server clients are running. There are also 
capabilities to allow sharing of cursors and the hardware 
color map. 

by Jeff R. Boyton, Sankar L. Chakrabarti, Steven P. Hiebert, John J. Lang, Jens R. Owen, Keith A. 
Marchington, Peter R. Robinson, Michael H. Stroyan, and John A. Waitz 



HP'S GRAPHICS DISPLAY HARDWARE provides
many display resources that must be carefully managed
to maintain order on the display when competing
HP-UX processes, such as the X server and Starbase
applications, are attempting to access the display hardware
at the same time. The hardware resources that must be
shared among these processes include the frame buffer
(video RAM), cursors, fonts, and the color map. This article
discusses methods used to allow Starbase applications and
the X server to share access to this common pool of hardware
resources, and a method called direct hardware access
(DHA), which enables Starbase applications to achieve high
performance when accessing the display, while maintaining
the integrity of the X Window System.

Display Hardware 

Fig. 1 is a block diagram of a typical graphics display. 
This is a generalized model and does not represent the 
implementation of any particular graphics product. Some 
elements are optional — for example, only 3D systems need 
a z-buffer and some low-end graphics systems have no 
graphics accelerator. 

Fig. 1. Block diagram of a typical HP hardware display system.

Graphics Accelerator. The graphics accelerator provides
specialized hardware to perform graphics operations on
commands and data from the display driver running on
the host system. The fundamental job of the accelerator is
to apply viewing and modeling transforms and light source
models to the data to convert it into a format usable by the
scan converter. The scan converter consists of hardware
for the generation of pixel data that represents polyline
and polygon primitives. Operations on more than one window
are supported by the window control logic, and hidden
surface removal is provided by the z-buffer. The accelerator
also has responsibility for the control of most other hardware
resources in the graphics processor, such as the configuration
of the frame buffer and color map.
Frame Buffer. The frame buffer is a specialized (usually
dual-ported) RAM. Each addressable location in the frame
buffer represents one picture element, or pixel. Some portion
of the frame buffer is displayable, so its contents represent
the current image on the screen. Pixel values are
read sequentially from the frame buffer and converted to
a video signal by the color map and its associated circuitry.
The entire frame buffer can be scanned as many as 60 times
per second to keep a steady image on the monitor. The
portion of the frame buffer that is not displayed is called
offscreen memory. Special circuitry called a block mover,
which is located in the frame buffer controller, is used to
copy a rectangular region from one place in the frame buffer
to another. Both the on-screen and offscreen portions of
the frame buffer are accessible to the graphics accelerator.

The frame buffer is physically separate from system RAM*
but it is mapped into the virtual address space of all processes
that access it. Therefore, it is possible for several
processes to have the same physical RAM of the frame
buffer mapped into their virtual address space (see Fig. 2).
This requires that processes must cooperate when making
modifications to the frame buffer. The methods we use to
share the frame buffer are discussed later.
Color Map. The color map is a very high-speed lookup 
table that maps the numbers stored in the frame buffer to 
the actual color values. The user specifies the mapping 
with commands like: the number 5 in the frame buffer 
represents the color a,b,c where a,b,c are the intensities of
red, green, and blue that must be mixed to create the desired
color. After looking these intensities up, the color map 
converts them to analog voltages and sends them to the 
monitor. 

*System RAM and frame buffer RAM are both components of the physical address space.
The physical system RAM is the 4M to 48M bytes of DRAM memory purchased with the
machine. The physical frame buffer memory is video memory mounted on the display
controller card.
Direct Hardware Access

In the X Window System, user processes, or clients, do 
not render directly to the frame buffer. To gain access to 
the frame buffer, clients make drawing requests to the X 
server, which is the only process with access to the frame 
buffer. The server has control and knowledge of the state 
of the frame buffer. However, to achieve maximum perfor- 
mance and functionality, some clients, such as Starbase 
applications, require direct access to the frame buffer. To 
gain direct access to the display in an organized way, a 
client must cooperate with the server. The client must ob- 
tain information from the server about the areas of the 
frame buffer that represent the visible area of the client's 
window and all rendering by the client must be clipped 
to this area. This is done by requesting the server to register
an existing window for direct hardware access (DHA). In 
response to this request the server sets up mechanisms to 
pass the clipping information to the client and to update 
it as necessary. 

Two methods are used to pass information from the 
server to the DHA client: shared memory and HP X exten- 
sion library calls. Graphics resource manager shared mem- 
ory is used for information that does not change in size, 
such as the cursor state and fonts. Variable-size data such 
as the clip list is passed to the client via HP X extension 
library calls (routines with an XHP prefix). Using shared 
memory for variable information would create shared memory
fragmentation problems, and the overhead of conversing
with the graphics resource manager (GRM), which manages
the shared memory area used by X server and Starbase
processes, could cause performance problems. The communication
links between a DHA client and the X server
are shown in Fig. 3.

Data Structures 

Fig. 4 shows the data structures in GRM shared memory
and process private memory that allow direct hardware 
access by Starbase DHA applications. Pictured are the data 
structures that would exist for one window on one screen. 
Multiple windows, color maps, and screens are supported 
and many of the structures shown are replicated in such 
circumstances. The X server and the Starbase processes 
have pointers for accessing the data structures in shared 
memory. The data items shown in Fig. 4 will be referenced 
and explained in later sections of this article. 

Opening a Window 

To allow a Starbase DHA process to be ported to run in 
X with little or no source code changes, it is important that 
the normal gopen() procedure work the same way it does 
when the application is drawing directly to a graphics dis- 
play. 

The following activities occur during a Starbase open 
(gopen()) of an X window:

■ If it is not already running, the graphics resource manager 
is started so that the Starbase process can access shared 
memory objects resulting from a DHA window registry. 

■ Tests are made to determine if the pathname parameter, 
which names the window, refers to an X window or one 
of the other supported objects of gopen(). 



■ If the object being opened is an X window, the host 
name, the display identifier, and the screen number are 
obtained. If a driver-level socket connection to the server 
for that screen does not exist, one is opened. 

■ If the window is to be an accelerated window, an ac- 
celerator state identifier is generated. 

■ The XHPRegisterWindow() procedure is called. If it succeeds,
then a data structure (DHA window object) will
be created in shared memory that contains the registered
window information.

■ The frame buffer is opened and mapped into virtual 
space using the device pathname returned by the win- 
dow registry call. 

■ The registered window object and other DHA shared 
memory objects, such as the DHA screen object, the dis- 
play state, and the X server's cursor state, are opened. 
These data structures are shown in Fig. 4. 

■ The initial clip list for the window is obtained from the
server. 

Registering for DHA Access 

An HP X library extension, XHPRegisterWindow(), was
added to the server to allow a client to request DHA access 
to a window. The client passes the identification numbers 
of the desired window and the screen containing the win- 
dow. Additionally, the client may request that the window 
be registered for use by a graphics accelerator. Upon receipt 
of the registration request, the server requests the graphics 
resource manager to create a structure in shared memory 
and fill it with information pertinent to the window. In 
Fig. 4 this structure is called the DHA window object. The 
information in the DHA object for each registered window 
includes: 

■ Clipstamp. An integer counter that is incremented
whenever the clip list for the window changes. This is
used as a trigger to the client that it needs to obtain a
new clip list via the XHPGetClipList() library procedure.

■ nUsers. An integer value representing the number of registrations
active against the window.

■ nAccelerated. An integer value representing the number
of accelerated registrations active against the window.

■ window_id. An integer value representing the server's identification
number for the registered window.

After the DHA window object is created, the server passes 
its GRM shared memory identification back to the client. 
The client obtains access to the DHA window object in
GRM shared memory and reads the information supplied
by the server. As the state of the window changes, the 
information in the DHA window object is updated by the 
server. 

A window may be registered for DHA access multiple 
times by the same client or by multiple clients. All registra- 
tions use the same shared memory object (DHA window 
object). A count is kept of the number of current registrations
on a window. A client terminates a registration
with the library procedure XHPUnRegisterWindow(). When the 
number of registrations drops to zero, the server requests 
the GRM to delete the shared memory object and the win- 
dow is no longer directly accessible by clients. 
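
The registration counting described above amounts to reference counting on the shared object. The following Python sketch illustrates the idea only; the class and method names (DhaWindowObject, Registry, register, unregister) are hypothetical, and a dictionary stands in for GRM shared memory.

```python
from dataclasses import dataclass


@dataclass
class DhaWindowObject:
    """Stand-in for the DHA window object kept in shared memory."""
    window_id: int
    clipstamp: int = 0
    n_users: int = 0
    n_accelerated: int = 0


class Registry:
    """Server-side bookkeeping: one shared object per registered window."""

    def __init__(self):
        self.objects = {}

    def register(self, window_id, accelerated=False):
        obj = self.objects.get(window_id)
        if obj is None:
            # First registration: create the shared object.
            obj = DhaWindowObject(window_id)
            self.objects[window_id] = obj
        obj.n_users += 1
        if accelerated:
            obj.n_accelerated += 1
        return obj

    def unregister(self, window_id, accelerated=False):
        obj = self.objects[window_id]
        obj.n_users -= 1
        if accelerated:
            obj.n_accelerated -= 1
        if obj.n_users == 0:
            # Last registration gone: delete the shared object.
            del self.objects[window_id]
```

Every registration of the same window returns the same object, so all clients observe the same clipstamp.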

The Clip List 

The visibility and position of the registered window can
change at any time. The user can partially obscure the
registered window with another window, move it to another
area, iconify it, and so on. The clip list is a list of rectangles
describing the areas of the window that are visible or
obscured. In Fig. 5a, window A is partially obscuring window
B. Window A is completely visible and its clip list
consists of only one rectangle. The clip list for window B
consists of three rectangles, two visible and one obscured.

The clip list is a dynamic list that can be as small as one
rectangle (the window is fully visible) or as large as several 
hundred rectangles. Rather than pass this information 
through shared memory, it is the responsibility of the DHA 
client to request the list via a library procedure. The 
clipstamp, which is created when a DHA client registers a
window, provides a fast mechanism to notify all interested 
DHA clients when the clip list changes and they need to 
obtain a new clip list. 

Fig. 4. The data structures that enable display resource sharing
between Starbase applications and the X server.

Whenever the clip list for a window changes because of
events such as a window move or stacking order change,
the server increments the clipstamp field of the DHA window
object. When the DHA client wishes to render, it compares
the clipstamp in the DHA window object against its local
copy. If they differ, the client knows the clip list has
changed since its last rendering operation and it must request
a new clip list. After making the request, the client
copies the shared memory value of the clipstamp into its
local copy for the next time. This mechanism avoids synchronization
problems because no client ever clears the
clipstamp field. Multiple clients sharing the same window
merely bring themselves into synchronization with the current
clipstamp value.
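
The clipstamp comparison can be illustrated with a small single-process simulation. This is a sketch under assumptions: the classes below are hypothetical, the shared memory object is an ordinary Python object, and get_clip_list() stands in for the XHPGetClipList() request.

```python
class SharedWindowObject:
    """Stand-in for the DHA window object in shared memory."""

    def __init__(self):
        self.clipstamp = 0


class Server:
    def __init__(self, shared):
        self.shared = shared
        self.clip_list = [(0, 0, 100, 100)]

    def change_clip_list(self, rects):
        self.clip_list = list(rects)
        self.shared.clipstamp += 1  # only the server ever writes the stamp

    def get_clip_list(self):
        return list(self.clip_list)


class DhaClient:
    def __init__(self, shared, server):
        self.shared = shared
        self.server = server
        self.local_stamp = None  # forces a fetch before the first render
        self.clip_list = []
        self.fetches = 0

    def prepare_to_render(self):
        # Compare the shared stamp with the local copy; fetch only on a mismatch.
        if self.shared.clipstamp != self.local_stamp:
            self.clip_list = self.server.get_clip_list()
            self.fetches += 1
            self.local_stamp = self.shared.clipstamp
        return self.clip_list
```

Because clients only read the stamp, any number of them can share one window object; each simply catches up to the current value at its own pace.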

To obtain a new clip list, the client uses the library pro- 
cedure XHPGetClipList(). The client passes to the server the 
identification numbers of the registered window and the 
screen containing the window. The procedure returns to 
the client the following information: 

■ x,y. Integer values representing the origin (upper left 
corner) of the window. This value is relative to the origin 
of the screen. 

■ Width. An integer value representing the width in pixels 
of the window. 

■ Height. An integer value representing the height in pixels 
of the window. 

■ Count. An integer value representing the number of rec- 
tangles in the clip list. 

■ Clip List Pointer. A pointer to a list of rectangles con- 
stituting the clip list. 

Fig. 5. (a) Two overlapping windows showing their positions
in screen coordinates. (b) The clip lists for the overlapping
windows in window-relative coordinates where X1,Y1 = upper
left and X2,Y2 = lower right. X2 and Y2 are one pixel outside
of the true boundary to make the mathematics easier.

The DHA client knows the size of the frame buffer and
where the frame buffer's physical memory is mapped in
its virtual memory space. By using this information in conjunction
with the origin and size of the window, the client
can index into the frame buffer and calculate the memory
addresses it is allowed to access.
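
As a worked illustration of this indexing, the sketch below computes the address of a window-relative pixel. It assumes a linear, row-major frame buffer with one byte per pixel and a row stride equal to the frame buffer width; real devices may differ, and the function names are hypothetical.

```python
def pixel_offset(fb_width, win_x, win_y, x, y, bytes_per_pixel=1):
    """Byte offset of window-relative pixel (x, y) from the start of the frame buffer.

    (win_x, win_y) is the window origin in screen coordinates, as returned
    with the clip list. Assumes a row-major layout with fb_width pixels per row.
    """
    return ((win_y + y) * fb_width + (win_x + x)) * bytes_per_pixel


def pixel_address(fb_base, fb_width, win_x, win_y, x, y, bytes_per_pixel=1):
    """Virtual address of the pixel, given where the frame buffer is mapped."""
    return fb_base + pixel_offset(fb_width, win_x, win_y, x, y, bytes_per_pixel)
```

For a 1024-pixel-wide frame buffer and a window whose origin is at (100, 50), the window's upper left pixel lies 50 × 1024 + 100 = 51300 bytes from the start of the mapping.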

The union of the rectangles in the clip list covers the 
renderable area of the window. Each rectangle is specified 
by the x,y coordinates of its upper left and lower right
corners. The values of these coordinates are relative to the 
origin of the window (see Fig. 5b). Each rectangle is marked 
as either visible or obscured. Visible rectangles are visible 
on the screen. Obscured rectangles are not visible because 
they are either obscured by another window or are partially 
off the screen. The client traverses this list, rendering into 
the visible rectangles. If the window has no backing store, 
which is a location in memory for backing up windows 
that become obscured, rendering to the obscured areas is 
discarded. If the window has backing store available and
the client can render to it, then rendering to obscured rect- 
angles is diverted to the backing store. Backing store is 
discussed in detail later in this article. 
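
The traversal can be sketched as follows. This is an illustration only: rectangles are (x1, y1, x2, y2) tuples in window-relative coordinates with an exclusive lower right corner, as in Fig. 5b, each clip list entry is paired with a visible flag, and the function names are hypothetical.

```python
def intersect(a, b):
    """Intersection of two (x1, y1, x2, y2) rectangles; None if they do not overlap.

    x2 and y2 are one pixel outside the true boundary, so an empty result is
    simply x1 >= x2 or y1 >= y2.
    """
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None


def fill_rect(target, clip_list, has_backing_store=False):
    """Split a fill request into pieces for the screen and for backing store."""
    to_screen, to_backing = [], []
    for rect, visible in clip_list:
        piece = intersect(target, rect)
        if piece is None:
            continue
        if visible:
            to_screen.append(piece)
        elif has_backing_store:
            to_backing.append(piece)
        # Otherwise rendering to the obscured area is discarded.
    return to_screen, to_backing
```

The exclusive lower right corner is what keeps the intersection test to two comparisons, which is the convenience the Fig. 5 caption alludes to.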

The client can request a clip list in one of three formats:
YXBANDED, VISIBLE, or OBSCURED. In the YXBANDED format,
both visible and obscured rectangles are present in the list 
(see Fig. 6). They are split and ordered so that all rectangles 
with the same y-origin will have the same height, thus 
creating bands across the window. Rectangles in the same 
band are sorted by increasing x-origin value. This type of 
ordering can enhance performance when rendering is done 
by filling horizontal scan lines. In the VISIBLE and 
OBSCURED formats, only rectangles of the respective type 
are present in the list. They are coalesced into the fewest 
possible number of rectangles and are not ordered. These 
formats are useful for displays that have hardware clipping 
capabilities. 
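
One way to produce the banded ordering is to split every rectangle at every horizontal edge present in the list and then sort by y-origin and x-origin. The article does not say how the server builds the list, so the Python sketch below is an assumption, and the function name is hypothetical. Rectangles are (x1, y1, x2, y2) with an exclusive lower right corner.

```python
def to_yx_banded(rects):
    """Split rectangles into horizontal bands and sort them by (y-origin, x-origin)."""
    # Every top or bottom edge in the list becomes a band boundary.
    edges = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    banded = []
    for x1, y1, x2, y2 in rects:
        # Cut this rectangle at each boundary that falls within it.
        cuts = [y for y in edges if y1 <= y <= y2]
        for top, bottom in zip(cuts, cuts[1:]):
            banded.append((x1, top, x2, bottom))
    # Within a band, order by increasing x-origin.
    return sorted(banded, key=lambda r: (r[1], r[0]))
```

After the split, every rectangle spans exactly one band, so rectangles with the same y-origin necessarily have the same height.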

A DHA client can use the X facility XSetClipRectangles() to 
restrict rendering to a subset of the window. If the graphics 
context containing the client clipping is specified to the 
XHPGetClipList() function, the resultant clip list will be restricted
to that subarea.

MOMA Windows 

Multiple, obscurable, movable accelerated windows, or
MOMA windows, refers to the hardware logic in the graphics
accelerator that provides very fast drawing and clipping
of multiple windows. The HP 98556A 2D Integer-Based
Graphics Accelerator and the HP 98732A 3D Graphics Accelerator
contain graphics accelerator engines that use
hardware facilities for clipping. When a DHA client wishes
to use an accelerator to render into a window, it registers
the window as accelerated. For some devices, such as the
HP 98556A, this also implies that the server will allocate
a MOMA hardware clipping state on behalf of the client.
For other devices, the DHA client allocates the clipping
state.

When the clip list for an accelerated window changes, 
the server downloads the new clip list directly into the 
MOMA hardware on behalf of the client. However, there 
may be reasons why the DHA client must also be able to 
load the clip list directly into the accelerator. For example, 
on the HP 98732A 3D Graphics Accelerator, the clipping 
rectangles for only a single window are stored on the de- 
vice. As graphics contexts are swapped into the accelerator, 
appropriate clip rectangles must be loaded into the MOMA 
hardware. When the server is able to maintain the clip list 
state in the accelerator, the accelerated DHA processes are 
able to achieve a steady throughput because they do not 
have to spend time downloading clip lists. 

The server itself does not take advantage of graphics
acceleration. There are two reasons for this. Currently no
graphics accelerators render according to all the X specifications.
More important, HP's accelerators are basically
first-in, first-out queues: the rendering commands are processed
in the order they arrive. Some operations that can
be performed by HP's advanced graphics devices, such as
the HP 98732A, can take a significant amount of time for
X to perform. However, a critical factor in the usability of
a window system like X is the response time for operations
such as window moves and creations. If the X server operations
must wait in line behind a long stream of complicated
graphics primitives, the response time will not be
acceptable.

Starbase/X11 Merge Locking Strategy

Graphics driver software is closely coupled to the
graphics hardware it supports. The driver routines set
hardware registers to certain values and then drawing operations
or other actions are started. In a multitasking environment
such as HP-UX, there may be more than one
process that includes a graphics driver that needs to access
the display hardware, and one process may be preempted
or swapped out at any time, even during the execution of
driver procedures. To prevent indeterminate results arising
from multiple processes using the graphics hardware in an
uncontrolled way, there must be some means of restricting
access to one process at a time. The permission (or token)
to use the display must be passed from one process to
another.

One way this might be done is by implementing a token
that the kernel controls. Only the process that has the token
would be allowed to access the graphics hardware, and all
other processes would actually be prevented from accessing
the registers. This is not how the problem is solved in
HP-UX. Instead, all processes are free to access the
hardware, requiring that a convention be established and
followed to ensure that only one process gains access to
the graphics display at one time. The HP-UX kernel helps
in this matter by providing a token in the form of a software
semaphore, and by blocking processes that request the
semaphore while another holds it. Processes that do not
follow the protocol of waiting to gain access to the token
are not prevented from changing the hardware registers.
The special kernel semaphore in the Starbase/X11 Merge
system is often called the display lock or kernel lock, and
it locks access to the physical display.

Fig. 7. Flowchart for the routine xosPrepareToRender(), which is
used to handle locking within the X server.

X Server and DHA Processes 

Since the display lock is a system resource that processes 
contend for, it is a prime candidate for creating the classic
deadlock problem. A typical deadlock problem was encountered
and solved for a situation involving a Starbase
DHA process and the X server. A Starbase process might 
gain access to the display lock not only to operate on the 
display hardware, but also to operate on shared memory 
structures associated with the display. In the course of its 
operations, it may need to call one of the standard HP 
extension X procedures to communicate with the server. 
When the server wakes up to service this request, as well 
as any other input it has received, it attempts to get the 
display lock. A deadlock occurs because the Starbase pro- 
cess is waiting for the server to respond, but the server is 
waiting for the display lock. 

To solve this and similar problems in the Starbase driv- 
ers, the calls to X procedures are strategically placed out- 
side of code regions where the lock is held. An interesting 
example of this is the code to fetch a new window clip 
list. As long as a Starbase process running in a window 
does not hold the lock, the X server can process a request 
to change the clip list for the window. However, if the 
Starbase process gets the lock, then it cannot ask the server 
for the current clip list because of the deadlock that would 
result. The code to solve this problem incorporates the
following algorithm: 

while the clip list is out-of-date
    request a new clip list from the server
    get the display lock
    if the clip list that was fetched is still up-to-date
        then exit the loop (go on to render)
        else release the lock (go back around the loop again)
endwhile
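
A runnable rendering of this loop, as a sketch only: a threading.Lock stands in for the display lock, the Window class stands in for the shared DHA window object, and fetch_clip_list() stands in for the request to the server. All of these names are hypothetical.

```python
import threading


class Window:
    """Stand-in for the shared state of one registered window."""

    def __init__(self):
        self.clipstamp = 0
        self.clip_list = ["initial"]


def fetch_clip_list(window):
    """Stand-in for asking the server for the current clip list."""
    return list(window.clip_list)


def lock_with_current_clip_list(window, display_lock, fetch=fetch_clip_list):
    """Return an up-to-date clip list, with the display lock held on return."""
    while True:
        stamp = window.clipstamp       # remember which version is being fetched
        clip_list = fetch(window)      # talk to the server while NOT holding the lock
        display_lock.acquire()
        if stamp == window.clipstamp:  # still up-to-date: keep the lock and render
            return clip_list
        display_lock.release()         # changed in the meantime: go around again
```

The caller renders with the returned clip list and then releases the display lock. The request to the server never happens while the lock is held, which is what avoids the deadlock.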

Locking within the X Server 

The X server typically processes requests from several 
clients for one or more windows each time it detects that 
there is input to be processed (a wakeup). At some point 
during this processing, before the graphics hardware is 
accessed, the server process must obtain the display lock. 



All access to the hardware in the Starbase/X11 server is
governed by the routine xosPrepareToRender() and its greatly
simplified cousin xosLockDevice(). The duties of xosPrepareToRender()
are to verify ownership of or claim the display lock,
remove cursors (Starbase or X) from the area to which the
server is about to render, and ensure that the X server's
and X display driver's concept of the current rendering
state are the same. Fig. 7 summarizes the actions of
xosPrepareToRender(). xosLockDevice(), as its name implies, only
performs the locking portion of xosPrepareToRender(). It is
used when it is desired to lock the hardware but not change
the display.

In some places it is difficult for the X server software to 
determine whether the lock is already held. To handle the 
possibility of nested attempts to gain the display lock, each 
X display driver maintains a lock count. When the lock 
count (nesting level) reaches zero, the X display driver 
issues an unlock call to the graphics driver in the HP-UX 
kernel that maintains the semaphore for the locked device. 
Immediately before unlocking the device, the X display 
driver resets the hardware and any software registers it 
might be maintaining to a state consistent with the expec- 
tations of other processes that might access the display. 
Under normal circumstances, this reset is valuable. However,
in stacked screens mode the reset is disastrous.
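
The lock count can be sketched as a nesting counter around the kernel lock. This is an illustration, not the driver code: threading.Lock stands in for the kernel semaphore, and the class and method names are hypothetical.

```python
import threading


class DisplayDriver:
    """Nested locking: only the outermost unlock resets and releases the device."""

    def __init__(self, kernel_lock=None):
        self.kernel_lock = kernel_lock or threading.Lock()
        self.lock_count = 0
        self.resets = 0

    def lock_device(self):
        if self.lock_count == 0:
            self.kernel_lock.acquire()  # first claim actually takes the display lock
        self.lock_count += 1

    def unlock_device(self):
        assert self.lock_count > 0, "unlock without matching lock"
        self.lock_count -= 1
        if self.lock_count == 0:
            self.reset_hardware()       # restore the state other processes expect
            self.kernel_lock.release()

    def reset_hardware(self):
        self.resets += 1                # placeholder for the real register reset
```

With two such counters on one physical device, as in stacked screens mode, either counter reaching zero would reset hardware the other half still relies on, which is the problem the next paragraph describes.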

In stacked screens mode one physical display device is 
made into two screens and is opened as two separate de- 
vices. This causes the display driver to maintain a separate 
lock count for each open. If either count goes to zero, the 
physical device will be reset and unlocked. A busy server 
is likely to render to both screens in a single wakeup, so 
locking one half of a stacked screens mode server must 
imply locking the other half. Although the display lock is 
shared, rendering to one half of a stacked screens device 
invalidates whatever is known about the hardware state in
the other half. Stacked screens mode is described in the 
article on page 33. 

Since claiming the lock on a device excludes other pro- 
cesses from access to that device, sharing the hardware 
requires that the lock be claimed at the last minute. The 
deferred lock claim avoids holding off direct hardware ac- 
cess clients any more than necessary. This requirement is 
especially critical when running with multiple physical 
screens. There is obviously no need to hold off direct 
hardware access to one screen while the X server is writing 
to another. 

Each X display driver provides an entry point called 
ValidateRenderingState(). This routine ensures that the hard- 
ware, display driver, and server are consistent and set up 
for rendering. Calls to ValidateRenderingState() can be very 
expensive, so care is taken to use it as little as possible. 
The usual reason for calling ValidateRenderingState() is that 
the hardware state is unknown or known to be invalid. For 
example, when the display driver releases the lock, the 
hardware is returned to its base state, so revalidation of
the rendering state is necessary upon claiming the lock. 

To minimize the number of times ValidateRenderingState()
is called, the server keeps a pointer to the last rendering
state structure used for each screen. This pointer is set to
null whenever the lock is surrendered, the cursor changes
shape, color, or position, or the attributes of the window
change. Any of these changes means that the contents of
the rendering structure itself have changed. When
xosPrepareToRender() is invoked, if the new rendering structure
is the same as the current one, the call to
ValidateRenderingState() may be skipped.
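
The cached pointer can be sketched as an identity check against the last rendering state used. The Screen class and its methods below are hypothetical, and a counter stands in for the expensive validation work.

```python
class Screen:
    """Per-screen cache of the last rendering state that was validated."""

    def __init__(self):
        self.last_state = None
        self.validations = 0

    def validate_rendering_state(self, state):
        self.validations += 1  # placeholder for the expensive hardware setup

    def prepare_to_render(self, state):
        # Skip validation when the same state structure was used last time.
        if state is not self.last_state:
            self.validate_rendering_state(state)
            self.last_state = state

    def invalidate(self):
        """Call when the lock is surrendered, the cursor changes, or window attributes change."""
        self.last_state = None
```

Comparing pointers rather than contents is sufficient only because every event that can change the contents also clears the cached pointer.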

Sharing Cursors 

In the effort to ensure that Starbase applications running 
in the X Window System have full functionality so that 
user programs can be used in the new environment without 
source code changes, one particular area of Starbase func- 
tionality that proved especially difficult was the implemen- 
tation of cursors in windows. Starbase implements many 
different kinds of cursors, including crosshairs, rubberband 
boxes, and raster cursors. For a Starbase process to draw, 
it must remove the cursors that interfere with the window 
to be accessed, perform the rendering operations, and re- 
place the cursors. The same is true of the X server when 
it needs to render somewhere on the screen. The shared 
drivers effort described on page 7 allows much of the code 
that draws and undraws the cursor to be shared, but there 
is still a lot of logic that had to be carefully designed to 
ensure that the server and Starbase behave correctly in all 
situations. In Fig. 4 the data structure labeled "Cursor 
State" contains a data block for each cursor. 

Cursor removal is complicated by the existence of both 
Starbase and X cursors. These two types of cursors have 
significant differences. Starbase cursors can have multiple 
instantiations — one window can contain more than one 
Starbase cursor. In the X environment only one X cursor 
can exist on the screen. Starbase cursors also differ from 
X cursors in that Starbase cursors are clipped to the win- 
dows containing them. Starbase cursors cannot extend into 
the borders of their containing windows. The X cursor is 
a global entity in that it is never clipped and can extend 
through multiple windows and their borders. To ensure 
that cursor operations are consistent and predictable, all 
the cursors in a window have a stacking order, and no 
cursor can be moved or operated on unless all the cursors 
on top of it have been removed. The X cursor is always on
top. 

Because Starbase is allowed to gopen (open) a single win- 
dow many times it is possible for an X window to have 
multiple Starbase cursors in it. A mechanism was added 
to the Starbase display drivers to maintain a list of active 
cursors for a particular window. This list, which is labeled
"Echo List" in Fig. 4, is located in GRM shared memory.
The list is traversed before the Starbase drivers do any 
rendering, and in the procedures associated with the XDI 
entry calls RemoveCursor() and ReplaceCursor(), each active 
cursor in the list is removed in the order it is found. When 
a Starbase cursor is activated, the Starbase driver adds it 
to the list, and when the Starbase cursor is deactivated or 
the program dies, the cursor is removed from the list. 
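
A minimal sketch of such a per-window cursor list is shown below. The class and method names are hypothetical, and the sketch assumes the list is kept top-most first, so that walking it in order never touches a cursor while another one still covers it; the article does not say how the real list is ordered.

```python
class EchoList:
    """Hypothetical per-window list of active Starbase cursors, top-most first."""

    def __init__(self):
        self.active = []
        self.off_screen = []    # cursors undrawn for the current rendering pass

    def activate(self, cursor):
        self.active.insert(0, cursor)   # assumption: a new cursor goes on top

    def deactivate(self, cursor):
        self.active.remove(cursor)      # also done when the owning program dies

    def remove_cursors(self):
        # Undraw in list order, top-most first.
        self.off_screen = list(self.active)
        return self.off_screen

    def replace_cursors(self):
        # Redraw bottom-most first, restoring the original stacking.
        order = list(reversed(self.off_screen))
        self.off_screen = []
        return order


echoes = EchoList()
echoes.activate("crosshair")
echoes.activate("rubberband")
removed = echoes.remove_cursors()     # rubberband first, then crosshair
restored = echoes.replace_cursors()   # crosshair first, then rubberband
```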

The X display drivers also use functions associated with 
the XDI entry points RemoveCursors() and ReplaceCursors() to
help the X server remove and replace the X and Starbase 
cursors before and after rendering operations. Unlike the 
routines used by the Starbase drivers, these routines accept 
flags to perform selective removal of Starbase cursors, the
X cursor, or both. This is accomplished without the X
server's having to know very much about the cursors' relative
stacking order or other details. Once the X cursor is
removed, it remains removed until the device is unlocked. 
The principal reason for not replacing the X cursor until 
the last minute is to avoid invalidating the current render- 
ing state. 

Removing cursors in the X server can be an expensive 
process, so care is taken to avoid unnecessary calls to the 
RemoveCursors() routine. The server keeps a flag for each
window to indicate whether cursors have been removed. 
Since the cursor removal code in the display drivers only 
removes cursors from visible areas of a window, cursors 
must be removed before changing the clip list in those 
cases where the window situation is being modified (e.g.,
a window is being moved or iconified). The cursors are 
replaced using the new clip list, thereby drawing them into 
any newly exposed areas. 

Cursor removal is further complicated by Starbase cur- 
sors in reserved planes. On the SRX and TurboSRX display 
systems the fourth overlay plane can be used to hold Star- 
base cursors. The fourth plane is used for cursors by writing 
the cursor color into the top eight entries of the color map. 
Whenever the fourth plane has a one in it, the cursor color
will be displayed on the screen, allowing the cursor to be 
drawn in the overlays without destroying the color already 
there. Clearing the fourth plane restores the old color. Since 
these cursors need not be removed for normal rendering, 
the RemoveCursors() routine typically does not remove them.
In some situations, such as changing the stacking order of
windows, moving windows, and so on, these cursors must
be removed because their associated windows may become
fully or partially obscured. These situations are handled
by catching them and passing the flag ALL_PLANES to the
X display driver when calling RemoveCursors(). Of course,
use of the ALL_PLANES flag must be remembered so it can
be passed to ReplaceCursors() when placing the cursors back
on the screen.

Sharing Fonts 

The fast alpha/font manager (FA/FM) system is a utility 
package that Starbase applications use to display raster
text. This proprietary system was originally developed for 
the HP Windows/9000 and Starbase graphics environment. 
Being an early proprietary system, FA/FM could not take 
advantage of any of the work done by public domain
systems such as X. New development for FA/FM, such as the
creation of new fonts, had to be done by HP. 

During the Starbase/X11 Merge design phase, the design
team saw the opportunity to remove FA/FM's reliance on 
proprietary fonts and share the font files associated with 
the X Window System. The team set about designing a new 
font loading system that could be shared by both the X 
server and the FA/FM libraries. In addition, the FA/FM 
system was reengineered to render with X fonts. There 
were good reasons to design the new system. By removing 
FA/FM's reliance on proprietary fonts and allowing FA/FM 
to use the same font files as X, we anticipated that FA/FM
would have a richer set of fonts to draw from. Whenever
a new font was distributed for X, it could be used by FA/FM
as well. X fonts are distributed in a format called Bitmap
Distribution Format, or BDF. This has become a de facto
standard in the workstation marketplace. Font vendors
typically make their fonts available in BDF format. BDF fonts
are usually translated by workstation vendors into Server
Natural Format (SNF) for efficiency in storage and loading.

We also saw the opportunity to conserve system re- 
sources. While the X server is running, offscreen memory 
and system RAM are heavily used. Therefore, it was de- 
cided that with proper design and engineering, we could 
create a system that allowed both FA/FM and X not only 
to share font files, but also to share the actual fonts in 
virtual memory and offscreen memory. 

The core of the font sharing system is the font loader. 
Early in the design phase it was decided that the easiest 
way to share fonts was to write a single font loading system 
that could be used by the X server and the FA/FM library. 
This shared loader's responsibility is to read the font file 
from disk into shared memory, making the font available 
to requesting processes. 

At the most basic level, the font loader is quite simple. 
When a request is made to load a font, the font loader does 
the following: 

    Locate and verify that the font file is a valid X font
    If font is in shared memory
        establish pointers
        return
    Else
        allocate the necessary virtual memory
        create a shared memory object in the GRM's
            shared memory space
        load the font's disk image into the shared memory
        establish pointers
        return

The GRM shared memory object created by the font
loader is the data structure labeled "Font Object" in Fig. 
4. As long as a particular font remains loaded, any further 
requests to load this font will result in the loader's finding 
the font in shared memory because the same code is used 
by Starbase applications and the X server to load fonts. 
This ensures that at no time will there be more than one 
copy of a font in memory. 
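
The load-once behavior can be sketched as below. This is only an illustration under stated assumptions: a dictionary stands in for the font objects in shared memory, `read_font_file()` is a hypothetical placeholder for validating and reading the file, and the font name is invented.

```python
shared_fonts = {}   # stands in for font objects held in shared memory
disk_reads = 0      # counts loads from disk, for illustration only


def read_font_file(path):
    # Hypothetical placeholder for verifying and reading an X font file.
    global disk_reads
    disk_reads += 1
    return {"path": path, "image": b""}


def load_font(path):
    """Return the single shared copy of a font, loading it on first request."""
    font = shared_fonts.get(path)
    if font is None:
        font = read_font_file(path)
        shared_fonts[path] = font
    return font


# Both callers go through the same loader, so they end up with the same object.
x_server_font = load_font("example.snf")
fafm_font = load_font("example.snf")
```

Because every requester runs the same lookup before reading the disk, the second request finds the copy made by the first, whichever process made it.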

There were some additional requirements that had to be 
met for this new technology to be acceptable. 

■ Object code compatibility. Even though the font files
used by FA/FM were changing, we had to ensure that
programs that used the old technology would still work.

■ Relinked applications had to work. We had to ensure
that relinking an application to use the new FA/FM font
technology did not cause it to break, even though the
application might contain absolute pathnames of fonts.

The first requirement meant that whatever was done for
the reengineered FA/FM system, the fonts that were
currently used by the old FA/FM system must remain where
they were in the file system so that old object code
references would still function.

The second requirement could have been met easily if
not for the first requirement. For example, if an object
module that contained a request to load the font
/usr/lib/raster/6x8/lp.8U was relinked, it had to be able to
find a font that was in the X font format from this pathname,
even though the exact file named contained the original
FA/FM font file. To get around this problem and to satisfy
the first requirement, the directory structure used for FA/FM
fonts was modified. It was decided that any directory that
had old-style FA/FM font files in it would have a subdirectory
named SNF. This SNF directory would contain analogs to
the FA/FM font files, but in the X font format. Fig. 8 shows
the old and new directory formats.

With this scheme, all of the old-format FA/FM font files 
can remain untouched, and the modified directory struc- 
ture satisfies the first requirement. 

To meet the second requirement, the method used by 
the font loader to find a font had to be expanded to accom- 
modate the new directory structure. Instead of just accept- 
ing the pathname given it, it had to be able to search a little 
further. Thus, the first step of the loader process was ex- 
panded to: 

    Look at the name given
    If valid font file, load it
    else
        insert "/SNF" into path
        look for valid font in this path
        if found, load it
        else error
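
The expanded search can be sketched as follows. The function name, the validity predicate, and the sample pathnames are hypothetical. The sketch inserts only the SNF directory component and leaves the file name unchanged, which may differ from the naming the real loader used.

```python
import posixpath


def resolve_font_path(name, is_valid_x_font):
    """Return the path of a loadable X font for `name`, or raise LookupError."""
    if is_valid_x_font(name):
        return name
    # Not an X font: look for its analog in the SNF subdirectory.
    directory, filename = posixpath.split(name)
    candidate = posixpath.join(directory, "SNF", filename)
    if is_valid_x_font(candidate):
        return candidate
    raise LookupError(f"no X-format font found for {name}")


# Hypothetical file system: only the SNF analog is a valid X font.
x_fonts = {"/fonts/raster/SNF/sample"}
resolved = resolve_font_path("/fonts/raster/sample", x_fonts.__contains__)
```

Passing the validity check in as a parameter keeps the sketch independent of any real file format; in practice the check would open the file and inspect its header.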

With the new font loader, fonts need to be loaded into 
memory only once no matter how many applications are 
using them. Backwards compatibility with the old FA/FM 
system is preserved. The X server and the FA/FM system 
now share the source code to accomplish font loading, thus 
ensuring compatibility and reducing maintenance require- 
ments. 

Fig. 8. The old (a) and new (b) directory structures used to
find and load FA/FM fonts. (a) /usr/lib/raster/6x8 containing
lp.8U. (b) The same directory with lp.8U.snf, the X font.

Sharing the Color Map

One of the recurring themes of the Starbase/X11 Merge
project was how to make X and Starbase share resources
that they previously believed they each controlled
exclusively. Among the resources that had to be arbitrated
were the color map and display controls such as display
enables and blink control.

Notions of Color Map

The X concept of a color map was modified quite a bit 
from Version 10 to Version 11 of the X Window System. 
In Version 10 there was a single color map that every client 
allocated colors from. When all of the colors were used, 
the client simply made do with what it had or exited. In 
Version 11, the concept of a virtual color map was designed.
Multiple color maps can be created regardless of how many 
color maps the hardware can support. In fact, every window 
can have a different color map. The color maps get installed 
by a window manager according to some policy usually
established by the user. This way, every window can use
the entire range of colors that a particular display has to 
offer. Fig. 9 illustrates the concept of virtual color maps. 

Starbase, on the other hand, maintains the single color
map notion. Starbase is designed to believe that it is always
running to a raw display device and that it has complete 
control over the device. It also assumes that there is a single 
hardware color map and that it writes directly to it. 

The Needs 

The solution for the X server sharing the color map with 
a Starbase application was simple. Every time a Starbase 
application opens a window and requests that the color 
map be initialized, a new X color map is created for that 
window. In this way, Starbase applications that believe 
that they have complete control over the color map can 
run without modification. This solution easily takes care 
of the problem of how to emulate a single hardware color 
map with exclusive access for a Starbase application. 

This solution does not answer the question of how Star- 
base applications can read from and write to the color map 
or how Starbase can share the color map with other X 
applications. The first option explored was to have Starbase 
use the standard X color map calls. There were a number 
of problems with this option. Starbase has a different notion
than X does of how some color maps look. For example,
in X it is possible to write only to the red bank of a particular
color map entry. This is not true of Starbase. For
24-bit displays, Starbase looks at the color map as a
single 256-entry array of RGB values that can only be
written as tuples. X views this same color map as three separate
banks of color maps, representing the red, green, and blue
banks of entries.

There was also the question of performance. Some Star- 
base applications use rapid alterations of the color map to 
achieve certain visual effects. Using the X color map mech- 
anisms, the overhead of X server communication might 
prove to be a bottleneck for performance. 

Finally, Starbase allows the manipulation of more than 
just the values of the color map. The shared memory version 
of the X color map includes additional attributes such as 
the display enable, color blinking, and color map blending. 
Also, information about transparent colors is included in 
overlay plane color maps. None of this information can be 
manipulated using the standard X color facilities. 

The Solution 

The design team agreed that Starbase's needs were 
beyond the capabilities provided by X and a new approach 
was needed. The approach finally agreed on was for each 
X color map to have an analog that the X and Starbase
display drivers dealt with called a display state (see Fig.
10). These display states are created in shared memory
every time a new X color map is created, and they can be
manipulated by X or directly by Starbase clients. As
information is written to the display state by a display driver,
the display state is checked to see if it is installed in the
hardware. If it is, then the hardware values and the display
state are changed. If not, then only the software values in
the display state are changed.
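
The write rule for a display state can be sketched as below. The `DisplayState` class, the `installed` attribute, and the list standing in for the hardware color map are all hypothetical; the real display state also carries attributes such as display enable and blink control that are omitted here.

```python
class DisplayState:
    """Hypothetical shared-memory analog of one X color map."""

    def __init__(self, size=256):
        self.colors = [(0, 0, 0)] * size
        self.installed = False   # True while this state is loaded in the hardware


hardware_color_map = [(0, 0, 0)] * 256   # stand-in for the hardware registers


def write_color(state, index, rgb):
    state.colors[index] = rgb             # the software copy is always updated
    if state.installed:
        hardware_color_map[index] = rgb   # hardware only when installed


installed_state = DisplayState()
installed_state.installed = True
background_state = DisplayState()

write_color(installed_state, 1, (255, 0, 0))    # reaches the hardware
write_color(background_state, 2, (0, 255, 0))   # software values only
```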

Since the display state is in the shared memory and is
managed by the graphics resource manager, Starbase
applications can now manipulate it. Now when a Starbase
application opens a window and requests initialization, the
driver performs the following operations:

■ Create an X color map (this operation creates a display
state in shared memory).

■ Associate the color map with the X window.

■ Establish pointers to the shared memory display state
data structure.

■ Initialize the display state.

Fig. 9. Virtual color maps. Figure labels: Window 1 through
Window 4; The Hardware; One Hardware Color Map and
Associated Registers (Contains Two Types of Information:
Display Plane Information and the Color Table).

Fig. 10. The architecture for sharing the color map in
Starbase/X11. Figure labels: X Server; Starbase Application;
Color Table; Display State (Shared Memory).

Whenever a Starbase application makes changes to the 
display state, the Starbase driver does so directly, not using 
the X color map routines. In this way, it is not slowed by 
the overhead of the X server communication mechanism. 
And since the Starbase driver creates its own color map, 
it assumes that it can do anything with it that could nor- 
mally be done with the hardware color map. 

When a window is opened without the INIT* flag, Starbase 
asks the X server which color map is associated with the 
window. It then connects to the display state of that color 
map in shared memory. In this mode Starbase respects the 
restrictions placed on the color maps by the X server pro- 
tocol. For example, Starbase will not change a color map 
cell if the X server has marked it as read-only. The X server 
also does not allow a Starbase program to change the display
enable register. This allows a Starbase application to
continue to use Starbase library calls to modify the color 
map, but still cooperate fully with other X clients. 

Only one problem was left to resolve: how to communi- 
cate changes made by Starbase applications to the shared 
display state data structure to the X server? An X server 
extension called XHPSynchronizeColorRange was created to 
solve this problem. When a Starbase display driver alters 
the values in a display state, it then calls this extension 
routine. The X server then reads the current values of the 
display state and updates its notion of the color map's 
contents. 



*INIT is a standard flag used with gopen that implies clearing of the open display planes
and the initialization of the color map to Starbase default values.



Backing Store 

The backing store is a piece of memory where the con- 
tents of a window are backed up in case the window gets 
destroyed or obscured by some user action, such as iconifi- 
cation or resizing, or by the action of another program. The 
X server supports backing store on a per-window basis. If
an X client requests the server to maintain a backing store
for a window, the server will do so, if possible.

Fig. 11 illustrates the use of backing store in the standard 
X environment. The contents of a window and its backing 
store are shown in different frames. Assume initially that 
window A is completely visible and has a picture of an 
arrow on the screen. At this stage its backing store is empty 
(frame 1). When window B is placed on top of window A,
window A is obscured and the picture in the obscured 
region is damaged. If window A was created with a backing 
store, the server will intervene before the damage takes 
place. When the server realizes that the surface of window 
A is going to be encroached upon by some other window, 
the server saves the picture from window A to its backing 
store (frame 2). When window B is removed, the picture 
in the unobscured region of window A has to be updated
(frame 3). If window A has a backing store, the server copies
the appropriate region from the backing store and recreates
the picture in window A (frame 4). 

If window A has no backing store, then the only way of
updating the picture would be to send an expose event
notice to the client owning window A. The expose event
tells the client that a region or regions of its window have
become exposed, that is, the picture contained in that
region may have become inconsistent. If the client chooses
to, it can update the picture by sending appropriate
rendering instructions to the server. In many graphics
applications, it ends up redrawing all objects in the window even
though only a small part of the window may have been
damaged.

Fig. 11. Views of a window and its backing store. In frame
1, window A is completely visible and backing store is empty.
In frame 2, window A is partially obscured by window B. In
frame 3, window A is unobscured and part of the screen
picture is missing, and in frame 4, the missing part of the
picture is copied from backing store to window A without
intervention by the client.

For complex applications, redrawing the entire window
is time-consuming. The X11 standard does not guarantee
that all server implementations will support backing
store, so the burden of redrawing a window is left to the
X clients. All X applications must be knowledgeable about
expose events and must be able to deal with them.

The Starbase/X11 server, however, must provide backing
store support and cannot depend on the clients' ability to 
deal with expose events, since the Starbase libraries and 
most Starbase applications have no notion of expose events 
and do not know how to handle them. A Starbase applica- 
tion running in an X window would be unable to refresh 
a window after the window became unobscured, so the X
server must update it from the backing store. The X server 
not only must support backing store, it must also make the 
backing store a sharable object between its own display 
drivers and the Starbase display drivers. Therefore, the 
Starbase/X11 X server employs "smart" rendering functions
to share the backing store between the X and Starbase
applications. 

HP support of the backing store capability in the X server 
dates back to the days of Version 10 of the X Window 
System. In the HP implementation of the X10 server this 
capability was called the retained raster facility. 

The Starbase/X11 implementation of backing store in the
X server was guided by two considerations:

■ Operations involving backing store should be as fast as 
possible. 

■ In a window with backing store a pixel must never be 
rendered twice. If the pixel is in the visible portion of 
the window, it must be rendered on the screen; otherwise 
it will be rendered in the backing store. 

Allocation Policy 

The backing store of a window is always of the same size 
as the window it is backing up. The X server always tries 
to accommodate the backing store in the offscreen frame 
buffer. With the assistance of the display hardware, oper- 
ations on backing store resident in the offscreen frame buf- 
fer are as fast as those on the screen. However, the frame 
buffer is a limited resource, and there will be occasions 
when there will not be enough space in the frame buffer 
for a backing store operation. When this happens, the X 
server will place the backing store in the virtual memory. 

Direct hardware access windows (DHA windows) are 
shared between the X server and the Starbase application. 
If at any time in its life a DHA window is declared to be 
a backing store window, the X server will ask the graphics 
resource manager for a portion of offscreen memory large 
enough to fit the window. If none exists, the X server will 
ask the GRM for a portion of shared memory so that both 
the X server and the Starbase application can render to the 
shared backing store. However, like the frame buffer, shared 
memory is also a limited resource. Thus, there is no guarantee
that sufficient space will be available in the shared
memory at the time the allocation request is made to the 
GRM. If the GRM cannot provide the needed amount of 
shared memory, the server will declare the DHA window 
to have no backing store. 

MOMA windows are never provided with backing store. 
MOMA windows employ transform engines in the 
hardware to accelerate their rendering performance. There 
is no way to take advantage of the hardware transform 
engines to render to the backing store if the latter is in 
virtual memory. Since we cannot guarantee that the backing 
store will be in offscreen memory, the X server does not 
support backing store for MOMA windows. Therefore, if a
window with backing store becomes a MOMA window, 
the X server will dispose of its backing store. 
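
The allocation policy of this section can be summarized in a short sketch. The function, its parameters, and the string labels for window kinds and memory locations are hypothetical; free space is modeled as simple byte counts rather than real requests to the GRM.

```python
def place_backing_store(kind, size, offscreen_free, shared_free):
    """Pick a home for a window's backing store, or None if it gets none.

    kind is "x" (ordinary X window), "dha", or "moma" (hypothetical labels).
    """
    if kind == "moma":
        return None            # MOMA windows never get backing store
    if size <= offscreen_free:
        return "offscreen"     # preferred: as fast as rendering to the screen
    if kind == "dha":
        # The Starbase client must be able to reach the store, so the only
        # fallback for a DHA window is shared memory.
        return "shared" if size <= shared_free else None
    return "virtual"           # ordinary windows fall back to virtual memory


ordinary = place_backing_store("x", 100, offscreen_free=50, shared_free=0)
dha_shared = place_backing_store("dha", 100, offscreen_free=50, shared_free=200)
dha_none = place_backing_store("dha", 100, offscreen_free=50, shared_free=50)
moma = place_backing_store("moma", 100, offscreen_free=500, shared_free=500)
```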

Smart Driver Functions 

The X server employs smart driver functions to render 
to its drawables. A drawable is a two-dimensional window 
or a pixmap that X and Starbase can draw on and treat as 
a single unit. These driver functions are called smart be- 
cause they can distinguish between different types of draw- 
ables, such as windows without backing store, windows 
with backing store in frame buffer, windows with backing 
store in virtual memory, and pixmaps in virtual memory. 

When a smart driver function is called to render to a 
window, the function can determine whether the window
has a backing store. If the window has a backing store, the
driver can determine the location of the backing store, 
which can be in the frame buffer, virtual memory, or GRM 
shared memory. Further, the driver can figure out which 
parts of the backing store represent obscured regions of the 
window. With this knowledge, the smart functions render 
the necessary pixels either on the screen or in the backing 
store. It is never necessary to render to a pixel twice. 
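
The render-once rule can be illustrated with a small sketch. The function name and the use of dictionaries and sets for the screen, the backing store, and the visible region are assumptions made for the example; the real drivers work on clip lists and frame buffer memory.

```python
def smart_fill(pixels, visible, screen, backing_store, value):
    """Write `value` to each pixel exactly once, choosing the destination."""
    for p in pixels:
        if p in visible:
            screen[p] = value          # visible: render on the screen
        else:
            backing_store[p] = value   # obscured: render in the backing store


screen = {}
backing_store = {}
window_pixels = [0, 1, 2, 3]   # a window four pixels wide
visible = {0, 1}               # pixels 2 and 3 are obscured by another window
smart_fill(window_pixels, visible, screen, backing_store, 7)
```

Each pixel ends up in exactly one of the two destinations, which is the property the second design consideration asks for.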

To make backing store sharable between X and a DHA
Starbase client, the X server HP extension
XHPRegisterWindow() is used to create the backing store
object shown in Fig. 4. The following information is
contained in this object:

■ Drawable Type (drawable_type). An integer flag
representing the backing store attributes of the window. The
values indicate whether the window has backing store and
whether it is located in the offscreen frame buffer
memory, virtual memory, or GRM shared memory.

■ Backing Store Stamp (bs_stamp). An integer counter that
is incremented whenever the state of the window's
backing store changes. This is a trigger to the client that it
needs to obtain new backing store information from the
shared memory object.

■ Shared Memory Offset (sm_offset). A pointer to the start
of backing store if it is located in shared memory. The
value of this pointer is an offset relative to the beginning
of the shared memory segment. The client must add its
own shared memory base address to determine the true
absolute address.

■ Shared Memory Stride (sm_stride). An integer value
representing the width of the shared memory backing store
pixmap in bytes.

■ Backing Store X Offset (bs_offset_x). An integer value
representing the frame buffer x offset of backing store if it
is in frame buffer offscreen memory.

■ Backing Store Y Offset (bs_offset_y). An integer value
representing the frame buffer y offset of backing store if it
is in frame buffer offscreen memory.

■ Backing Store Planes (bs_planes). An integer bit mask
representing the display bit planes that are managed by
backing store.

■ Backing Store Pixel (bs_pixel). An integer representing the
value to be placed in the bit planes not managed by
backing store.
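
A sketch of how a client might use this object is shown below. The field names follow the list above, written with underscores; the `StarbaseClient` class, its `refresh()` method, and all numeric values are hypothetical and chosen only to show the stamp check and the offset-to-address arithmetic.

```python
from dataclasses import dataclass


@dataclass
class BackingStoreObject:
    drawable_type: int
    bs_stamp: int
    sm_offset: int
    sm_stride: int
    bs_offset_x: int
    bs_offset_y: int
    bs_planes: int
    bs_pixel: int


class StarbaseClient:
    """Hypothetical client-side view of the shared backing store object."""

    def __init__(self, shared_base):
        self.shared_base = shared_base   # where this process mapped the segment
        self.seen_stamp = None
        self.address = None

    def refresh(self, obj):
        """Re-read the object only when bs_stamp has changed."""
        if obj.bs_stamp == self.seen_stamp:
            return False
        self.seen_stamp = obj.bs_stamp
        # sm_offset is relative to the segment start; add this process's base.
        self.address = self.shared_base + obj.sm_offset
        return True


obj = BackingStoreObject(drawable_type=0, bs_stamp=1, sm_offset=0x400,
                         sm_stride=640, bs_offset_x=0, bs_offset_y=0,
                         bs_planes=0xFF, bs_pixel=0)
client = StarbaseClient(shared_base=0x10000)
first = client.refresh(obj)    # stamp is new: pointers recomputed
second = client.refresh(obj)   # stamp unchanged: nothing to do
obj.bs_stamp += 1              # the server changed the backing store state
third = client.refresh(obj)
```

Storing an offset rather than an absolute pointer matters because each process may map the shared segment at a different base address.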

Deep Backing Store 

Starbase supports 24-plane deep windows. Therefore, it 
was necessary to develop a method for the X server to 
support a 24-bit-per-pixel backing store. The main problem 
was determining how deep backing stores can be organized. 
In 24-plane-deep displays, the frame buffer is organized as 
three memory banks, each eight planes deep. The three 
banks are the red, green, and blue banks. As long as the 
backing store is placed in the frame buffer, there is no 
problem. The RGB components of each pixel are stored in 
the corresponding bank. There is a problem, however, 
when the backing store must be placed in virtual or shared 
memory. 

In the X server, rendering to the virtual memory is done 
using the memory drivers leveraged from the Starbase li- 
brary. There are two main components of the memory 
driver: the bit driver and the byte driver. The bit driver is 
used to draw on one-bit-per-pixel virtual memory pixmaps, 
and the byte driver is used for one-byte-per-pixel virtual
memory pixmaps. In implementing the deep backing store
we could have written a new memory driver for drawing 
to 24-bit-per-pixel virtual memory pixmaps or organized 
the deep backing store so that the existing memory drivers 
could be used without any modification. We chose the 
latter solution (see Fig. 12). 

The organization of the deep virtual memory backing 
store mirrors that of the deep frame buffer. The deep virtual 
memory backing store is organized as three software banks, 
each one byte deep, corresponding to RGB banks in the 
hardware (see Fig. 12c). With this organization we are able 
to use the byte drivers without any change. However, for 
each drawing operation we call the byte driver three 
times — once for each software bank. This organization also 
simplifies the process of copying data from the virtual 
memory backing store to the screen because the data from 
a software bank is simply moved to the corresponding 
hardware bank. 
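
The three-bank layout can be sketched as follows. Both functions are hypothetical: `byte_driver_fill()` stands in for an unmodified one-byte-per-pixel memory driver, and `deep_fill()` shows one way of calling it once per software bank on a contiguous buffer.

```python
def byte_driver_fill(bank, stride, x, y, w, h, value):
    """Stand-in for the one-byte-per-pixel memory driver: fill a rectangle."""
    for row in range(y, y + h):
        start = row * stride + x
        bank[start:start + w] = bytes([value]) * w


def deep_fill(store, width, height, x, y, w, h, rgb):
    """Fill a rectangle in a 24-bit store laid out as three contiguous banks."""
    bank_size = width * height
    view = memoryview(store)
    for i, component in enumerate(rgb):   # red, green, then blue bank
        bank = view[i * bank_size:(i + 1) * bank_size]
        # Each bank looks like an ordinary 8-bit pixmap to the byte driver.
        byte_driver_fill(bank, width, x, y, w, h, component)


width, height = 4, 2
store = bytearray(3 * width * height)   # red bank, green bank, blue bank
deep_fill(store, width, height, x=1, y=0, w=2, h=1, rgb=(10, 20, 30))
```

The byte driver never needs to know that it is drawing one component of a deeper pixel, which is why no new 24-bit driver is required.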



Fig. 12. (a) A 24-plane-deep window on the screen. Of
course, the physical depth of display memory is not seen by
the user. (b) 24-plane-deep backing store in offscreen frame
buffer memory organized in three hardware banks (red, green,
and blue) of 8 planes each. The picture on the display is
replicated on the three banks. (c) 24-plane-deep backing store
in virtual memory. This is a contiguous piece of memory
organized in three compartments. Each compartment is a
software bank mirroring the hardware banks.

Sharing Overlay and Image Planes
in the Starbase/X11 Merge System

Developing a method to take full advantage of the 
capabilities of display memory was one of the challenges 
of the Starbase/X11 Merge project.

by Steven P. Hiebert, John J. Lang, and Keith A. Marchington 



DEPENDING ON THE DISPLAY DEVICE, the X server
allows users to configure a display in four fundamental
display modes: image mode, overlay mode,
stacked mode, and combined mode (see Fig. 1). The display
mode determines how the hardware display memory is
used. This article describes the rationale for the different
display modes and how each of them works. The combined
mode is discussed in greater detail than the others because
it is the most sophisticated mode and it is available on the
TurboSRX 3D graphics accelerator display system.

HP offers a wide variety of display hardware for its work- 
station products. This display hardware ranges from low- 
resolution monochrome displays to high-resolution dis- 
plays with 16 million colors and 3D acceleration hardware. 
Using the full range of display capability in the display 
hardware was one of the challenges for the Starbase/X11
Merge design team. 

One of the underlying philosophies of the X Window 
System is that it provides the tools to build different user 
interfaces, but it does not enforce any particular user
interface standard. Thus X provides mechanisms, not policy.
To maintain this philosophy, it was decided that the X
server would provide the different display modes for the
X Window System and allow the user to choose the display
mode most appropriate for the application.

Overlay and Image Planes 

All display systems for HP's workstations have at least 
one and as many as 24 planes of display memory. In addi- 
tion, some of the more sophisticated display systems have 
additional display memory called overlay planes. The overlay
planes are so named because they appear on top of, or
over, the image planes. For example, if the overlay planes 
of a display are enabled and each pixel is set to black, then 
the image planes would not be visible. Areas of the overlay 
planes must be disabled or made transparent to view the 
image planes. Overlay planes can be set to a transparent 
color so that the image planes can be seen. Existing HP 
displays have from zero to four overlay planes. 

The image planes are used primarily for rendering complex
images and usually have more capabilities than overlay
planes. For example, on the TurboSRX display system,
the 3D accelerator can clip to an arbitrary set of rectangles
in the image planes, but not in the overlay planes. Overlay
planes have a number of uses, but primarily they are used 
to display information like text and menus. In this way
rendering in the image planes is not damaged by menus
or text, and costly redraws of pictures in the image planes
are prevented. With some 3D graphics or complicated 2D
graphics, such redraws can take many minutes.

Fig. 1. An illustration of the different display modes. (a) Image
mode. All rendering by X is done only to the image planes of the
display. (b) Overlay mode. All X rendering is done only to the
overlay planes, and the image planes can be used by other
applications. To see what is on the image planes the overlay
planes would have to be made transparent. (c) Stacked screens
mode. The overlay and image planes are treated as two separate
screens. (d) Combined mode. Implemented primarily to support
the display capabilities of TurboSRX, the combined mode uses
the overlay and image planes as one screen.

The overlay and image planes are located in the frame 
buffer and, as shown in Fig. 2, each plane is organized into
on-screen and offscreen memory. 

Image Mode 

Every HP display system supports the image mode and 
all but the TurboSRX will default to image mode if the user 
does not specify a display mode. In the image mode, the 
X server performs all rendering only on the image planes 
available on the display device. If the display device has 
any overlay planes they are set to transparent in this mode. 
See Fig. 1a.

Overlay Mode 

The overlay mode is almost identical to the image mode,
except that the overlay planes of the display device are 
used by X rendering calls, and the image planes are free 
to be used by other applications such as Starbase graphics 
applications. A good example of this configuration is the 
HP 9000 Series 300 and 800 SRX (solids rendering acceler- 
ation) display system. The 3D acceleration hardware of the 
SRX is not capable of clipping to window boundaries, so 
it is not useful in a window environment. For the 3D accel- 
eration hardware to be useful, it must have unobstructed 
access to the full, unobscured image planes. To run with 
this hardware configuration, a window-based application 
can provide all user interface components (e.g., windows
and menus) in the overlay planes using the X display driver, 
and use the 3D accelerator for more complex rendering in 
the image planes. By creating a transparent window in the 
overlay plane, or by setting the window system's root win- 
dow to transparent, the image planes can be made viewable. 
On the SRX display, this is the only way to use the 3D 
graphics accelerator and a window system such as X at the 
same time. 

Stacked Screens Mode 

In the stacked screens mode the overlay planes are used 
as one screen and the image planes as another (see Fig.
1c). In this way, the window system has twice as much
screen "real estate." Stacked screens mode is literally the 
image mode and overlay mode running simultaneously. 
The screens are stacked one on top of the other with the 
visible screen being the one where the mouse cursor is 
located. To get from one screen to the other, the user simply 
moves the mouse off the edge of the current screen. The 
other screen is made visible as the mouse enters it. All of 
the normal capabilities of X are available in both the image 
and the overlay screens, and all of the restrictions of the 
image and overlay modes apply. 

Stacked screens mode is particularly popular with soft- 
ware developers because it is possible to make twice as 
much information easily viewable. This means that a de- 
veloper can have a debugger, terminal emulators, editors, 
code viewers and other applications all running at the same 
time and viewable. 



Combined Mode 

Image, overlay, and stacked screens modes were available
in the X Window System before the Starbase/X11
Merge project. However, the goal of the Starbase/X11 Merge
project was to provide full-performance Starbase graphics in
X windows wherever possible. The TurboSRX display, which
is the successor to the SRX display system, has the hardware
necessary to do accelerated graphics in windows, so the
X server needed to support accelerated graphics in windows
on that display. This could have been done
in image mode on the TurboSRX, but it would not have
been as aesthetically pleasing.

The design team decided that a new approach was 
needed for the TurboSRX. This new approach is called the 
combined mode. The combined mode uses all of the planes 
of the display system (both image and overlay) as a single 
screen, making it look to the application as if there were 
simply one contiguous set of planes with a variety of different
capabilities (see Fig. 1d). Using both the overlay and
the image planes as a single screen is basically the opposite
of how stacked screens mode works. In stacked screens mode
the image and overlay planes are treated as two separate screens.
With the combined mode the capabilities of the TurboSRX
and X can work together.

TurboSRX Capabilities 

Many of the capabilities provided by the HP 9000 Series 
300 and 800 TurboSRX graphics subsystem are also provided 
by its predecessor, the SRX. These capabilities include: 

■ Image Planes. There can be 8 to 24 planes of image memory
plugged into the display system. The system can be
used as an eight-bit pseudocolor device (CMAP_NORMAL
mode) offering 256 colors simultaneously or as a 24-bit
color device (CMAP_FULL mode) offering over 16 million
colors simultaneously.

■ Overlay Planes. Each display system has three or four
planes of memory that overlay (or are in front of) any
other display memory. The original intention for these 
planes was to use them for floating text, cursors, or 
menus. 

■ Double Buffering. The image planes can be partitioned 
as pairs of banks in a variety of ways for double buffering. 
The most common configurations are to divide them into 
two eight-bit banks in CMAP_NORMAL mode and into two
12-bit banks in CMAP_FULL mode. 

■ Color Map Mode Hardware. The color map mode hardware
enables the display system to run either in the
CMAP_NORMAL mode or the CMAP_FULL mode. If 24 planes
of image memory are plugged into the display system,
in CMAP_NORMAL mode each pixel is interpreted by taking
the eight-bit pixel value out of the low bank of display
memory and using it as an index into a table of RGB
(red, green, blue) values to determine what color a particular
pixel on the display should be. In CMAP_FULL
mode, each of the three eight-bit banks of display memory
is read to determine which red, green, and blue value
should be used on the display. By writing to a hardware
mode register, these modes can be dynamically switched
and different windows on the display screen can be displayed
in different color map modes.

■ 3D Graphics Hardware. Both systems have the ability to 
render complex 3D graphics, providing realistic images 
on the display. The front cover of this issue shows an 
example of the realistic images that can be produced 
using combined mode on a TurboSRX display system. 
The 3D images (car, engine, and gears) are located in the
image plane, and the other items on the display are lo- 
cated in the overlay plane. 

Capabilities available in the TurboSRX but not in the 
SRX include: 

■ Hardware Cursor. Two planes of memory (in addition 
to the overlay and image planes) are available for the 
display of cursors. This feature allows a hardware cursor 
to be placed on the display without disturbing the con- 
tents of any of the image or overlay planes beneath it. 
The hardware cursor also offers the advantage of not 
having to remove the cursor to render, since it resides 
in its own plane of display memory. Not removing the 
cursor before rendering provides better performance for 
rendering routines. 

■ MOMA Window Support. From the perspective of the
Starbase/X11 Merge system, this is probably the most
significant feature on TurboSRX. MOMA (multiple,
obscurable, movable, accelerated) window support allows
the TurboSRX accelerated graphics capabilities to
be used in a windowed environment by providing special
clipping hardware. This clipping hardware allows
the TurboSRX graphics accelerator to render only to the
exposed rectangles of a window. The TurboSRX hardware
has support for a maximum of 32 clipping rectangles
for MOMA windows, which is an adequate number
for most window systems, but a small number for the X
Window System.

With these TurboSRX features in mind, the design team 
focused on designing the Starbase/X11 Merge system to
take full advantage of the hardware capabilities of the Tur- 
boSRX. This resulted in the following design goals for the 
combined display mode: 

■ Provide support for MOMA windows that would allow 
Starbase applications to use the 3D graphics accelerator 
in X windows. 

■ Support eight-bit and 24-bit color modes. Make eight-bit
pseudocolor and 24-bit color with double buffering available
to applications.

■ Maintain the visual aesthetics of the system. When pos- 
sible, minimize the damage that different hardware 
modes and different color maps cause to the appearance 
of the display when they are displayed simultaneously. 

■ Provide a state-of-the-art X server implementation. Reconcile
the capabilities of the X Window System, Version
11, with the capabilities of TurboSRX.

The Architecture 

With X11, a number of new concepts were introduced
to increase the capabilities of X such that it could be run
on the entire range of today's display hardware as well as
any future display hardware that might be developed. The
concept in X11 that is most important to the combined
mode is called the "visual." The visual is the mechanism
X uses to describe the capabilities of a particular display's
hardware. The visual structure includes:

■ Class. The class describes how a color is mapped from
memory to the display. There are two major classes,
static and dynamic, and subclasses of each. The subclasses
include gray, mapped color, and decomposed color.
Static and dynamic classes are defined at X server startup
time. Static classes cannot be changed by application
programs, but dynamic classes are definable and changeable
in the application program. The gray subclass means
that all the colors in the color map are shades of gray.
For the mapped color subclass, one-byte pixel values
from the frame buffer are used to index into a color map
of RGB tuples which describe the color to be displayed
(see Fig. 3a). For the decomposed color subclass, a three-byte
pixel value is used to get the color value from the
color map. The first byte is used for red, the second byte
for green, and the last byte for blue (see Fig. 3b). The
mapped color subclass allows up to 256 colors and the
decomposed subclass allows up to 16 million colors.
Each entry in the color map table represents a color
intensity (shade). For instance, the value 10 might represent
dim RGB intensities and 220 would represent bright
RGB intensities. These red, green, and blue intensities
are mixed together to produce the displayed color. Putting
these attributes of color maps together (class and
subclass) allows the device to support up to six types of
color maps. Table I shows the X color map types.



Table I
X Color Map Types

Subclass            Static         Dynamic

Gray                StaticGray     GrayScale
Mapped Color        StaticColor    PseudoColor
Decomposed Color    TrueColor      DirectColor

■ Color map entries. The number of different color map 
entries available for use by client applications. 

■ Bits of RGB information. How many bits of resolution 
are available to describe red, green, and blue color values.

■ Planes. The number of planes of display memory available
on the display device.

Fig. 2. The organization of the image planes in the frame
buffer. (a) A display system containing both image and overlay
planes (e.g., the HP 98550A Color Graphics Board). (b)
A display system with only image planes in the frame buffer
(e.g., the HP 98547A).

Fig. 3. (a) Mapped color subclass. A one-byte pixel value is
used to index into the color map to obtain the RGB tuple. (b)
Decomposed color subclass. A three-byte pixel value contains
an index for each primary color list in the color map.
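
The two color subclasses described above can be sketched in a few lines of Python. This is only an illustration of the lookups shown in Fig. 3, not code from the X server: the function names, the sample color maps, and the byte order of the decomposed pixel (red in the most significant byte) are assumptions made for the example.

```python
# Illustrative sketch only: names, sample maps, and byte order are assumptions.

def mapped_color(pixel, color_map):
    """Mapped color: a one-byte pixel indexes a table of (R, G, B) tuples."""
    return color_map[pixel & 0xFF]

def decomposed_color(pixel, red_map, green_map, blue_map):
    """Decomposed color: each byte of a three-byte pixel indexes its own list."""
    r_index = (pixel >> 16) & 0xFF   # first byte: red (assumed most significant)
    g_index = (pixel >> 8) & 0xFF    # second byte: green
    b_index = pixel & 0xFF           # last byte: blue
    return (red_map[r_index], green_map[g_index], blue_map[b_index])

# A 256-entry gray ramp used as a mapped color map, and identity ramps
# used as the three primary lists of a decomposed color map.
gray_ramp = [(i, i, i) for i in range(256)]
identity = list(range(256))

print(mapped_color(10, gray_ramp))
print(decomposed_color(0xDC0A00, identity, identity, identity))
```

With one byte per pixel the mapped form can address at most 256 entries, while the decomposed form addresses 256 entries per primary, which is where the 16 million color figure comes from.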

X11 makes it possible to have more than one of these
visuals available on a given screen at the same time. With 
multiple visuals, it is possible to create a mode that incor- 
porates the capabilities of both the image and the overlay 
planes of the TurboSRX so that the full range of the dis- 
play's capabilities are available to applications. We decided 
to treat the image and overlay planes as a single screen 
with the overlay planes represented by one three-or-four- 
plane Pseudocolor visual type. The number of planes is de- 
pendent on how the user sets up the device file for them. 
The image planes, with their CMAP_NORMAL and CMAP_FULL
modes, are allowed to have either an eight-bit Pseudocolor
visual type, a 24-bit DirectColor visual type, or both simul- 
taneously. Another option allows an eight-bit double-buf- 
fered Pseudocolor visual type for image planes and a 12-bit 
double-buffered DirectColor visual type for image planes. 

In combined mode, the root window for the screen al- 
ways resides in the overlay planes, and the overlay plane 
visual is the default visual for the screen. Any client that 
simply asks for a window to be created with the default 
visual of the screen ends up residing in the overlay planes. 
For an application to create a window in the image planes, 
it has to request the visual information from the server and
specifically request the desired visual type.
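
The difference between taking the default visual and asking for a specific one can be modeled as follows. This is a hedged sketch, not the server's data structures: the `Visual` record, the list of visuals, and the `match_visuals` and `create_window` helpers are hypothetical names invented for the illustration, and the four-plane overlay visual is just one of the configurations the text allows.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Visual:
    visual_class: str   # e.g. "PseudoColor" or "DirectColor"
    depth: int          # number of planes
    layer: str          # "overlay" or "image"; known to the server, not the client

# Hypothetical visual list for a TurboSRX screen in combined mode.
VISUALS = [
    Visual("PseudoColor", 4, "overlay"),   # default visual; the root window uses it
    Visual("PseudoColor", 8, "image"),
    Visual("DirectColor", 24, "image"),
]
DEFAULT_VISUAL = VISUALS[0]

def match_visuals(visuals, visual_class=None, depth=None):
    """Return the visuals matching every criterion that is not None."""
    return [v for v in visuals
            if (visual_class is None or v.visual_class == visual_class)
            and (depth is None or v.depth == depth)]

def create_window(visual=None):
    """A window lands in whichever planes its visual belongs to."""
    chosen = visual or DEFAULT_VISUAL
    return {"visual": chosen, "planes": chosen.layer}

plain = create_window()                                  # default visual
wanted = match_visuals(VISUALS, "DirectColor", 24)[0]    # explicit request
accelerated = create_window(wanted)
print(plain["planes"], accelerated["planes"])
```

The client only ever names a visual class and depth; which planes the window ends up in follows from the visual it chose.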

The color map modes CMAP_NORMAL and CMAP_FULL in
the image planes are handled through virtual color maps.
Virtual color maps are an image of what the window or
client thinks the hardware color map looks like. As was
described in the article on shared display resources on
page 20, each color map in the Starbase/X11 Merge system
has an analog called a display state, which is used by the 
display drivers. Each display state contains the current 
color values for a device's color map, some device-specific 
information about which planes of the display are enabled, 
and in the case of the TurboSRX, the color map mode of 
the hardware. X provides a way for a program to control 
which color map is currently loaded into the hardware 
(this is called validating the color map). Usually a special 
X client, such as a window manager, is the only program 
that changes which color map is loaded (validated). The 
window manager may have several methods for the user 
to specify which color map is loaded. Therefore, when the 
color map for an eight-bit PseudoColor window is installed
in the image planes, the hardware will be switched to CMAP_NORMAL
mode, and when the color map for a 24-bit DirectColor
window is installed, the hardware will be switched
to CMAP_FULL mode. Fig. 9 on page 29 illustrates the virtual 
color map concept. 
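
A minimal model of this behavior, assuming a display state that carries only the color values and the color map mode, might look like the sketch below. The `DisplayState` and `Hardware` classes and the `install` function are hypothetical names for the illustration; the real display state also records which planes are enabled and other device-specific data.

```python
from dataclasses import dataclass, field

@dataclass
class DisplayState:
    """Hypothetical stand-in for the display state behind one color map."""
    colors: list      # current color values for the color map
    cmap_mode: str    # "CMAP_NORMAL" or "CMAP_FULL"

@dataclass
class Hardware:
    cmap_mode: str = "CMAP_NORMAL"
    loaded_colors: list = field(default_factory=list)

def display_state_for(visual_class, colors):
    """Eight-bit PseudoColor maps use CMAP_NORMAL, 24-bit DirectColor CMAP_FULL."""
    mode = {"PseudoColor": "CMAP_NORMAL", "DirectColor": "CMAP_FULL"}[visual_class]
    return DisplayState(colors=list(colors), cmap_mode=mode)

def install(hardware, state):
    """Installing a color map loads its colors and switches the hardware mode."""
    hardware.loaded_colors = list(state.colors)
    hardware.cmap_mode = state.cmap_mode

hw = Hardware()
full = display_state_for("DirectColor", [(0, 0, 0), (255, 255, 255)])
normal = display_state_for("PseudoColor", [(0, 0, 0), (128, 128, 128)])
install(hw, full)
print(hw.cmap_mode)
```

Each window keeps its own virtual color map untouched; only the act of installing one changes what the hardware shows.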

The result of this approach is that most windows are 
created in the overlay planes. Most X server clients such 
as window managers and terminal emulators use the de- 
fault visual. Applications that request visual types that are 
in the image planes can change the color map in the image 
planes and use one of the color map modes without affect- 
ing the visual appearance of the windows in the overlay 
planes. Most of this color map control was provided for 
Starbase applications because they usually assume that 
they can change the color map at will. As a result a Starbase
application creates its own virtual memory color map for 
a window that it opens. 

This design allows the TurboSRX to be used in windows 
and satisfies all of the design goals for the TurboSRX dis- 
play driver in the X server. With most of the windows in 
the overlay planes, their clipping regions do not have to 
be included in the hardware clip list for the accelerator. 
This helps us live with the 32-clip-rectangle restriction of 
the TurboSRX and achieve the full performance of a Star- 
base application running in X. 

Having most of the commonly used windows in the over- 
lay planes allows combined mode to maintain visual aes- 
thetics at the highest possible level, while still allowing 
both eight-bit and 24-bit windows in the image planes. As 
a counterexample, take the case of image mode. If image 
mode were to allow both eight-bit and 24-bit windows 
simultaneously, one of those two visual types would have 
to be the default. If an application created a window of a 
visual type other than the default and its display state were 
installed, it would change the hardware color map mode 
and all of the windows of the default visual type would 
become incorrect in appearance. In fact, the windows, including
the root window, would become completely indecipherable.
However, with combined mode, when the
hardware color map changes, the windows in the overlay
planes (where most applications reside) remain visually
correct and only image plane windows become visually
incorrect.

Fig. 4. When the stacking order is changed, area X becomes
the newly exposed area in the image plane. In the original
clipping algorithm, besides causing area X to be painted
transparent in the overlay plane, the image plane is considered
to be damaged, causing the image plane to be cleared
to the background color and an exposure event sent to the
client owning window 1. The client would then rerender area
X in the image plane.

This design also provides a very straightforward view 
for an X application. A client application can simply con- 
nect to the server and request windows of the default type 
and get windows in the overlay planes. Or, using the XGetVisualInfo
routine, the client application can interrogate the
server for all of its visuals or a particular visual it is interested
in. The application never worries whether it is in
overlay planes or image planes. The server automatically 
places the window in the appropriate planes without inter- 
vention by the application. 

Implementation 

The architecture described above fits very neatly into
the X model, and for the most part, the implementation
of combined mode was straightforward. But there were
some challenges in the implementation that resulted in
some interesting solutions. The two most challenging areas
were how to allow the user to see through the overlay planes
to the image plane windows and how to clip windows and
generate exposures for only those areas of windows that
were actually damaged by other windows. A window that
needs exposure is one that is covered up and needs to be
seen. To see a window that resides in the image planes,
the overlay planes must be made transparent. At first, creating
this transparent hole seemed like a difficult task, but
as it turned out, the X server architecture allowed this to
be handled quite easily. Whenever an area of a window is
exposed, the server is required to paint the window's background.
At this point, the X server determines if the window
being painted is in the image planes, and if it is, simply
makes the same area of the overlay planes transparent. In
this way all visible regions of the image plane window
have a corresponding area in the overlay planes painted a
transparent color.
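
The background-painting hook can be sketched as below. The planes are modeled as small lists of rows, and the transparent pixel value, the window dictionary, and the function names are all assumptions made for the illustration rather than details of the X server.

```python
TRANSPARENT = 0   # assumed overlay pixel value that lets the image planes show

def fill(plane, rect, value):
    """Fill rect = (x, y, width, height) in a plane stored as a list of rows."""
    x, y, w, h = rect
    for row in range(y, y + h):
        for col in range(x, x + w):
            plane[row][col] = value

def paint_background(window, exposed, overlay, image):
    """Paint the background of a newly exposed area of a window."""
    if window["planes"] == "image":
        fill(image, exposed, window["background"])
        fill(overlay, exposed, TRANSPARENT)   # cut the matching hole
    else:
        fill(overlay, exposed, window["background"])

overlay = [[7] * 6 for _ in range(4)]   # overlay starts opaque (color 7)
image = [[0] * 6 for _ in range(4)]
win = {"planes": "image", "background": 42}
paint_background(win, (1, 1, 3, 2), overlay, image)
for row in overlay:
    print(row)
```

Because the hole is cut in the same routine that paints the background, no separate bookkeeping is needed to keep the overlay in step with the image plane windows.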



Combined Mode Clipping 

To solve the problem of clipping windows and generating 
exposures for damaged windows, and to make full use of 
the capabilities of the TurboSRX hardware, the clipping 
algorithm used in the X server had to be modified. In the 
original X server, the clipping algorithm made no distinc- 
tion between overlay and image planes when computing 
clip lists for windows. Lacking this distinction, creating a 
window in the overlay planes would cause the server to 
conclude that any windows in the image planes obscured 
by the overlay plane window were damaged. When the 
overlay plane window was moved or destroyed, newly ex- 
posed areas of the image plane window would be cleared 
to the window's background color and an exposure event 
would be sent to the client owning the image plane window.
The exposure event tells the client that it must rerender
to the image plane (see Fig. 4). The modification of
the clipping algorithm allows windows in the overlay 
planes to be created and destroyed without affecting win- 
dows in the image planes. 

For both clipping algorithms, new clip lists are computed
whenever an action is taken that could change the clip list
(e.g., changing the stacking order of the windows on the
screen). The function xosValidateTree() is used to compute
the new clip lists. xosValidateTree() adds the visible portions
of any children of the parent window to be reclipped back
into the parent window's clip list and then, passing the
parent's clip list as the working universe, calls the routine
xosComputeClips() to let each of the parent window's children,
and the children's children, and so on recompute their clip
lists. The working universe includes the visible areas of
the parent window. Upon return from xosComputeClips(), the
working universe is the parent's new clip list. By subtracting
the old clip list from the new clip list the parent can
compute which areas have been newly exposed. That is,
any area in the new clip list that is not in the old clip list
must be newly exposed.
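
The final subtraction step can be illustrated with a small sketch. To keep it short, regions are represented as sets of pixel coordinates rather than as lists of rectangles, which is a simplification and not how the server stores clip lists; the function names are hypothetical.

```python
def cells(rect):
    """Expand rect = (x, y, width, height) into a set of (col, row) cells."""
    x, y, w, h = rect
    return {(col, row) for row in range(y, y + h) for col in range(x, x + w)}

def region(rects):
    """Union of a list of rectangles, as a set of cells."""
    covered = set()
    for rect in rects:
        covered |= cells(rect)
    return covered

def newly_exposed(old_clip, new_clip):
    """Any area in the new clip list that is not in the old one."""
    return region(new_clip) - region(old_clip)

# A parent window of 8 x 4 cells whose right half was hidden by a child.
old_clip = [(0, 0, 4, 4)]          # only the left half was visible
new_clip = [(0, 0, 8, 4)]          # the child has been lowered or removed
exposed = newly_exposed(old_clip, new_clip)
print(len(exposed))
```

Only the exposed cells need a background paint and an exposure event; areas present in both clip lists are left alone.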

The modification of the clipping algorithm to support
combined mode consists mainly of computing two clip
lists for all the windows on the screen. One set of clip lists,
which we can call the old-style clip lists, is generated based
on the unmodified clipping algorithm described above (i.e.,
these clip lists contain windows from both the image and
the overlay planes). The second set of clip lists is computed
taking only the image plane windows into account (image-only
clip list). Within the X server, image plane windows
use the image-only clip list as the default clip list, and the
overlay plane windows use the old-style clip list as the default.
Both image and overlay plane windows use the old-style
clip list for cursor removal. Since either type of window
can have children or subwindows of the other type, windows
on both planes must keep the image-only and old-style
clip lists available.
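
The two sets of clip lists can be sketched for a flat stack of sibling windows. This is a simplified model under stated assumptions: the window tree is ignored, regions are sets of cells rather than rectangle lists, and `clip_lists` and `default_clip` are hypothetical helper names.

```python
def cells(rect):
    """Expand rect = (x, y, width, height) into a set of (col, row) cells."""
    x, y, w, h = rect
    return {(col, row) for row in range(y, y + h) for col in range(x, x + w)}

def clip_lists(stack):
    """stack is ordered top-most first; returns {name: (old_style, image_only)}."""
    result = {}
    above_all, above_image = set(), set()
    for win in stack:
        area = cells(win["rect"])
        # Old-style: clipped by every window above. Image-only: clipped
        # only by the image plane windows above.
        result[win["name"]] = (area - above_all, area - above_image)
        above_all |= area
        if win["planes"] == "image":
            above_image |= area
    return result

def default_clip(win, lists):
    """Image plane windows default to image-only, overlay windows to old-style."""
    old_style, image_only = lists[win["name"]]
    return image_only if win["planes"] == "image" else old_style

stack = [
    {"name": "menu", "planes": "overlay", "rect": (2, 0, 4, 4)},   # on top
    {"name": "model", "planes": "image", "rect": (0, 0, 4, 4)},
]
lists = clip_lists(stack)
print(len(lists["model"][0]), len(lists["model"][1]))
```

In this example the overlay menu removes half of the model window from its old-style clip list but nothing from its image-only clip list, so rendering in the image planes can continue underneath the menu.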

In the new combined mode algorithm, rendering to the
image plane is done only when there are changes to the
windows in that plane and not because of changes to windows
in the overlay plane. The image-only clip list is used
to handle rendering to image plane windows. The old-style
clip list is used to determine which areas of the overlay
plane windows must be painted transparent to expose windows
in the image plane.

Combined mode clipping allows rendering to an image
plane window while it is obscured by an overlay plane
window. Since the root window is always in the overlay
planes, rendering can even take place to an image plane
window that is iconified. The server must take care, however,
to avoid rendering to areas of iconified image plane
windows used by other image plane windows. Image windows
that are not iconified are automatically removed from
the allowable rendering area by the old clipping method.
Extra programming was required to remove iconified image
plane windows from the allowable rendering areas of other
iconified image plane windows. That is, if two iconified
image plane windows overlap, neither may render to the
overlapping area. When one or the other of the iconified
windows is mapped, it will get an exposure event for that
overlapping area.

Conclusion 

Combined mode is a solution to the complex problem 
of how to support a high-end display system in the best 
possible way. Combined mode offers some capabilities that 
allow the TurboSRX display system to work at its full potential
in an X environment. With the addition of combined
mode, the X server now offers four different display modes 
of operation to take full advantage of the broad range of 
display hardware for HP workstations. 



Sharing Input Devices in the Starbase/X11
Merge System

To provide support for the full set of HP input devices and 
to provide access to these devices for Starbase 
applications running in the X environment, extensions were 
added to the X core input devices: the keyboard and the 
pointer. 

by Ian A. Elliott and George M. Sachs 



STANDARD X SERVERS SUPPORT two input de- 
vices: the pointer (mouse, tablet, light pen, etc.) and 
the keyboard. These devices are known as the core 
input devices. The X server sends information from the 
input devices to client programs in packets called "events." 
The keyboard generates key events, while the pointer gen- 
erates button or motion events. These events contain infor- 
mation that includes the absolute location in two dimen- 
sions where the event occurred, the location relative to the 
X window in which the event occurred, and a timestamp. 
For key and button events, there is also a field that tells 
which key or button was pressed. 

In a typical X environment, multiple application pro- 
grams called clients run simultaneously. Each has its own 
window or set of windows and all share the core input 
devices. The X server arbitrates which client gets a particu- 
lar input event by determining which window has the 
"input focus." The focus window, which is the window 
that is allowed to receive input from input devices, is nor- 
mally either the smallest window that contains the pointer, 
or is an arbitrary window explicitly established as the focus 
window by a protocol request made by a client program. 
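
The focus rule described above can be written as a short sketch. The window records and the `focus_window` function are hypothetical, and the tie-breaking by window area is simply one way to read "the smallest window that contains the pointer"; a real server walks its window tree instead.

```python
def contains(win, x, y):
    """True if the point lies inside the window's (x, y, width, height) rect."""
    wx, wy, w, h = win["rect"]
    return wx <= x < wx + w and wy <= y < wy + h

def focus_window(windows, pointer, explicit_focus=None):
    """Return the name of the window that should receive input events."""
    if explicit_focus is not None:
        return explicit_focus            # set by a client protocol request
    x, y = pointer
    hits = [w for w in windows if contains(w, x, y)]
    if not hits:
        return None
    smallest = min(hits, key=lambda w: w["rect"][2] * w["rect"][3])
    return smallest["name"]

windows = [
    {"name": "root", "rect": (0, 0, 100, 100)},
    {"name": "term", "rect": (10, 10, 50, 50)},
    {"name": "button", "rect": (20, 20, 10, 10)},
]
print(focus_window(windows, (25, 25)))
print(focus_window(windows, (90, 90)))
print(focus_window(windows, (25, 25), explicit_focus="term"))
```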

We faced two major problems in the area of input device 
support for Starbase/X11 Merge: how to provide the ability
to use the full set of Hewlett-Packard input devices in an
X environment, and how to access those devices through
Starbase in that environment. The first problem arose be- 
cause there is currently no X standard for using other input 
devices in addition to the core devices. If additional devices 
were supported, there is no provision within the defined 
core events for determining which device generated the 
event. There is also no provision in the existing events for 
reporting data of more than two dimensions, or motion 
data whose resolution is different from that of the screen. 
The problem with Starbase was that prior to this project, 
Starbase did not provide a way for multiple programs to 
share input devices. The only input devices that could be 
shared were those for which a window system arbitrated 
the sharing and allowed Starbase input. These devices in- 
cluded the HP Windows/9000 locator and the X Version 
10 pointer and keyboard. 

To overcome these problems, the goals established for
sharing input devices in the Starbase/X11 Merge
system included:

■ Support a wider range of input devices including the 
core devices, and ensure that all the devices supported 
have the same functionality as that provided by the core 
devices. 

■ Support all input devices that follow the HP-HIL (Hewlett-Packard
Human Interface Link) specification 1 and

X Input Protocol and X Input Extensions



The core protocol of the X Window System provides a standard
syntax for making requests to the X servers. The syntax describes
the sequence of bytes that make up each of the protocol requests.
For example, the XSetInputFocus request, which allows a client to
choose which window should receive input from the keyboard,
has the following format:

Length    Value                    Meaning
(bytes)

1         42                       XSetInputFocus Request ID
1         0, 1, or 2               Revert-to-Window Parameter
2         3                        Request Length (in four-byte words)
4         0, 1, or a Window ID     Focus-Window Parameter
4         Timestamp Information    Focus-Time Parameter



The information in a protocol request like the one above tells
what request is being made (XSetInputFocus), the length of the
request (three four-byte words, or 12 bytes), and the values of
any parameters the request has. The parameters in the request
specify which window should receive input from the keyboard
(the Focus-Window parameter), which window should receive input
if the focus window disappears (Revert-to-Window parameter), and
when the XSetInputFocus request should take effect (Focus-Time
parameter). The 0, 1, and 2 values in the parameters are special
constants that indicate no window, whichever window contains
the X pointer, and whichever window was named as the parent
of the focus window, respectively.
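
The byte layout in the table can be reproduced with Python's `struct` module. This is a sketch for illustration only: the function name and the sample window ID are made up, and little-endian byte order is assumed here, whereas a real client declares its byte order when it connects to the server.

```python
import struct

X_SET_INPUT_FOCUS = 42   # request ID from the table above

def set_input_focus_request(focus_window, revert_to, time, byte_order="<"):
    """Pack the 12-byte request: ID, revert-to, length in words, window, time."""
    length_in_words = 3
    return struct.pack(byte_order + "BBHII", X_SET_INPUT_FOCUS, revert_to,
                       length_in_words, focus_window, time)

# Hypothetical window ID; revert-to value 1 and timestamp 0 are sample values.
request = set_input_focus_request(focus_window=0x00400001, revert_to=1, time=0)
print(len(request), request.hex())
```

The packed length of 12 bytes matches the three four-byte words stated in the request's own length field.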



X was designed to allow individual vendors such as Hewlett-Packard
to extend the protocol by defining new requests that
can be interpreted by X servers in the same way as standard X
requests. For example, the HP input extension provides a protocol
request named XHPSetDeviceFocus. This request allows a
client program to choose which window should receive input
from some input device other than the keyboard or mouse. The
request has the following format:



Length    Value                    Meaning
(bytes)

1         128 ≤ Number ≤ 255       ID of HP Input Extension
1         8                        XHPSetDeviceFocus Request ID
2         5                        Request Length (in four-byte words)
4         0, 1, or a Window ID     Focus-Window Parameter
4         Device Identifier        Focus-Device Parameter
4         Timestamp Information    Focus-Time Parameter
1         0, 1, or 2               Revert-to-Window Parameter
3                                  Unused Bytes





The request begins with a number that identifies the extension
that implements the request and distinguishes the request from
core protocol requests. The next byte identifies the request within
the extension. The length, Focus-Window, Focus-Time, and Revert-to-Window
parameters serve the same purpose as they do for the
XSetInputFocus request described above. The Focus-Device parameter
identifies the input device for which the client program making
the request wishes to control the destination of the input.
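
The extension request can be packed the same way as a core request, with the extension ID in the first byte. The sketch below is illustrative only: the function name, the extension ID of 130, and the window and device values are hypothetical, and little-endian byte order is assumed.

```python
import struct

XHP_SET_DEVICE_FOCUS = 8   # request ID within the extension, from the table above

def set_device_focus_request(extension_id, focus_window, device, time,
                             revert_to, byte_order="<"):
    """Pack the 20-byte request, ending with three unused padding bytes."""
    if not 128 <= extension_id <= 255:
        raise ValueError("extension IDs lie in the range 128..255")
    length_in_words = 5
    return struct.pack(byte_order + "BBHIIIB3x", extension_id,
                       XHP_SET_DEVICE_FOCUS, length_in_words,
                       focus_window, device, time, revert_to)

# Hypothetical values: extension ID 130, window 1, device 2.
request = set_device_focus_request(130, focus_window=1, device=2,
                                   time=0, revert_to=0)
print(len(request), request.hex())
```

The three unused bytes round the request up to a whole number of four-byte words, which is why the length field is 5 rather than a fractional value.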



are supported by the HP-UX operating system.

■ Allow the choice of the core devices to be easily config- 
ured and provide reasonable defaults if no choice is 
made. 

For Starbase applications the following additional goals 
were established: 

■ Provide full functionality for Starbase applications using
input devices in an X window. 

■ Ensure that the design does not require source code
changes in the Starbase application, with the possible
exception of the call to the gopen function which
is used to open an input device.

■ Allow multiple programs to access and share the same 
input devices simultaneously. 

HP-HIL Input Devices 

HP-HIL input devices are grouped into three general 
categories by the Starbase/X11 server. First, there are
keyboards and keyboard-like devices such as all of the
different HP language keyboards, the HP 92916A Bar Code
Reader, and the HP 46086A 32-Button Box programmable
function keys. These devices either generate keycode data
or, as in the case of the barcode reader, generate USASCII
data, which can be translated to keycodes. The second group
of input devices are those that generate absolute positional 
data as well as button information. These include graphics 
tablets and touchscreens. The existing devices of this type 
report absolute positions for two axes, and may report zero, 



one, three, or four buttons. The third group of input devices
are those that generate relative motion data. These include
two-button and three-button HP-HIL mice such as the HP
46095A 3-Button (quadrature) Mouse, the M1309A
Trackball, the HP 46085A Control Dial Module (nine-knob
box), and the HP 46083A Knob (one-knob box). The existing
devices of this type may report two or three axes of motion
and report zero, two, or three buttons.

There are a few HP-HIL devices that are not easily
categorized. For example, the HP 46084A ID Module,
which is used to prevent unauthorized software duplication,
does not generate any input, but occupies a position
on the HP-HIL. It currently cannot be accessed through the
X server. A client program can access it directly, but not
across a network. Audio extension modules, such as the
HP 46082A, do not occupy a position on the HP-HIL, but
X functions exist to access the beeper contained in the 
module. 

Core Input Devices 

Up to seven input devices can be attached to one HP-HIL. 
There is no standard definition in X for determining which 
of those devices should be used as the pointer or the 
keyboard. In Starbase/X11 Merge, explicit specification of
the core devices is done through a configuration file. The 
name of the configuration file is constructed using the dis- 
play number specified by the user when X is invoked. 
Because that number is under the control of the user, multiple
configuration files with different names can exist and
can be used to specify different input devices as the core
devices. When a device is chosen, it can be specified either
by giving the name of its device file and its intended use,
or by giving an ordinal position (first, second, etc.) and the
type of device, along with its intended use. The position
of the device is relative to other input devices of the same
type on the HP-HIL, with the first device being the one
closest to the computer. For example, a graphics tablet can
be specified as the pointer device with a line in the configuration
file of the form /dev/hil2 pointer, or with a line of the
form first tablet pointer.
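
The two forms of specification can be told apart mechanically. The following sketch assumes a line is either a device file name followed by a use, or an ordinal word, a device type, and a use. The type and function names are hypothetical, the device file name in the usage note is only an example, and the real server's parser may accept more than this.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { SPEC_INVALID, SPEC_BY_FILE, SPEC_BY_POSITION } SpecKind;

typedef struct {
    SpecKind kind;
    char path[64];   /* device file, when kind == SPEC_BY_FILE */
    int  ordinal;    /* 1 = closest to the computer, when SPEC_BY_POSITION */
    char type[32];   /* device type, e.g. "tablet" */
    char use[32];    /* intended use, e.g. "pointer" */
} DeviceSpec;

/* Map "first".."seventh" to 1..7 (one HP-HIL holds up to seven devices). */
static int ordinal_from_word(const char *word)
{
    static const char *const words[] = {
        "first", "second", "third", "fourth", "fifth", "sixth", "seventh"
    };
    for (int i = 0; i < 7; i++)
        if (strcmp(word, words[i]) == 0)
            return i + 1;
    return 0;
}

/* Parse one configuration line into a DeviceSpec. */
DeviceSpec parse_device_line(const char *line)
{
    DeviceSpec spec;
    char a[64], b[32], c[32];
    int fields;

    assert(line != NULL);
    memset(&spec, 0, sizeof spec);          /* kind starts as SPEC_INVALID */
    fields = sscanf(line, "%63s %31s %31s", a, b, c);

    if (fields == 2 && a[0] == '/') {
        /* "<device file> <use>" */
        spec.kind = SPEC_BY_FILE;
        strcpy(spec.path, a);
        strcpy(spec.use, b);
    } else if (fields == 3 && ordinal_from_word(a) > 0) {
        /* "<ordinal> <type> <use>" */
        spec.kind = SPEC_BY_POSITION;
        spec.ordinal = ordinal_from_word(a);
        strcpy(spec.type, b);
        strcpy(spec.use, c);
    }
    return spec;
}
```

Given the line first tablet pointer, the sketch reports a positional specification with ordinal 1, type tablet, and use pointer. A line such as /dev/hil2 pointer is reported as a device file specification.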

It is possible to specify explicitly that the server operate 
with no pointer device or no keyboard device, or both. In 
addition, the keyboard can be specified as the keyboard 
device and the pointer device. This feature is provided for 
working environments where it is not desirable to have a 
separate pointer device. If a keyboard is used as the pointer 
device, the user can specify in the X server configuration 
file which keys cause the pointer to move and the mag- 
nitude of movement. These keys are taken over by X and 
are not available for use by client programs. To prevent 
conflicts in the use of these keys between X and client 
programs, it is possible to specify that the keys should be 
used for pointer movement only if a specified set of the 
modifier keys (e.g., left Shift, right Shift, CTRL, left Extend
char, and right Extend char) are pressed at the same time.
The user can also specify which keys should be interpreted 
as buttons for the pointer device. 

Default choices for the core devices reflect the devices 
most commonly used as the default keyboard or pointer 
device. For example, if a keyboard is attached to the HP-HIL 
and can be opened by the X server, it is used as the keyboard 
device. If more than one keyboard is attached, the last one, 
that is, the one most distant from the computer on the 
HP-HIL, is used. If no keyboard can be opened by the server, 
the last key device, such as a barcode reader or 32-button
module, is used. For the default core pointer device, if an 
HP-HIL mouse is attached to the HP-HIL, it is used as the
pointer device. If no mouse can be opened by the server, 
the last device on the HP-HIL that can generate motion 
data is used. If no such device can be found, the keyboard 
is used as the pointer device. If the motion device chosen 
is one that can report more than two axes of motion, axes 
beyond the first two are ignored. 
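
The default rules above amount to a short search over the devices on the loop. The sketch below assumes the devices are held in an array ordered from nearest to farthest from the computer. The type and function names are hypothetical, and picking the last mouse when several are attached is an assumption, since the text does not say which mouse is preferred.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEV_KEYBOARD,       /* a true keyboard */
    DEV_OTHER_KEY,      /* barcode reader, button box, ... */
    DEV_MOUSE,
    DEV_OTHER_MOTION    /* tablet, knob box, ... */
} DevClass;

typedef struct {
    DevClass cls;
    int can_open;       /* nonzero if the server could open the device */
} LoopDevice;

/* Index of the last openable device of class `cls`, or -1 if none. */
static int last_of(const LoopDevice *dev, size_t n, DevClass cls)
{
    int found = -1;
    for (size_t i = 0; i < n; i++)
        if (dev[i].can_open && dev[i].cls == cls)
            found = (int)i;
    return found;
}

/* Last keyboard if one can be opened, otherwise the last key device. */
int default_keyboard(const LoopDevice *dev, size_t n)
{
    int k;
    assert(dev != NULL || n == 0);
    k = last_of(dev, n, DEV_KEYBOARD);
    return k >= 0 ? k : last_of(dev, n, DEV_OTHER_KEY);
}

/* A mouse if possible, then the last motion device, then the keyboard. */
int default_pointer(const LoopDevice *dev, size_t n)
{
    int p;
    assert(dev != NULL || n == 0);
    p = last_of(dev, n, DEV_MOUSE);
    if (p < 0)
        p = last_of(dev, n, DEV_OTHER_MOTION);
    return p >= 0 ? p : default_keyboard(dev, n);
}
```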

Some additional functionality was provided for HP 9000 
Series 800 Computers. These machines are capable of sup- 
porting up to four HP-HIL loops, each of which can be 
associated with a set of input devices. Our goal for these 
machines was to provide maximum flexibility in specifying 
input devices while still providing reasonable defaults if 
no specification is made. The method chosen provides a 
default based on the display number specified when X is 
invoked. This display number is used to determine which 
configuration files are used in initializing the server. 

The user can specify an HP-UX path to be searched for 
all input devices or the path to be used for an individual 
input device. This functionality was implemented to allow 
the HP-HIL path to be explicitly chosen on HP 9000 Series 
800 computers. However, it also proved useful during proj- 
ect testing. A test tool that was written to simulate HP-HIL 



driver input used this feature to simulate input from vari- 
ous input devices. The result was greater flexibility in test- 
ing various combinations of hardware. See the article on 
page 42 for more information about project testing.

HP Input Extensions 

Although the core protocol of the X Window System is 
standard across all vendors, X was also designed to allow
individual vendors to implement extensions to that pro- 
tocol. This allows vendors to add functions that are specific 
to their hardware or software requirements, or that are not
included in the core protocol. If these extensions are found 
to be useful for the general X community, a procedure 
exists to propose them as standards to be included in future 
releases of X. 

This was the method chosen to add support for HP-HIL 
devices within the X server. It provided a solution that met 
the needs of X clients, while also providing Starbase drivers
with information from input devices that could not be re- 
ported through the core X protocol. See the box on page 
39 for an example of X protocol and X extension format. 

There are two parts to most X extensions: library functions
to invoke the protocol requests the extension defines, and a server
portion to process the requests and implement the func- 
tions. The X protocol defines the format of requests in the 
X library. An input X extension is more complicated than 
other X extensions because it also involves the creation of 
new input events, code to generate the events within the 
server, a means to allow clients to ask to receive those 
events, and code to route the events to the appropriate 
clients. Unlike many extensions, input X extensions require
additions to both the device independent and device
dependent portions of the server. 

To provide functionality equivalent to that provided for
the core devices, it was necessary to implement protocol
requests that are analogous to core protocol requests and 
also allow the user to specify which device should be manip- 
ulated. These functions include the ability to select input 
events from a device, control the focus of that device, and 
"grab" (temporarily take exclusive control of) a device. 

Other necessary functions include those that allow a
client to list all the input devices available to the X server, 
and functions to enable and disable those devices. Also, 
input events for this extension were defined so that more 
than two dimensions of motion data could be reported. 

Technical Issues and Trade-offs 

The major input extension implementation issue we en- 
countered was how to treat input devices other than the 
pointer that report motion data. The position of a typical 
pointer, such as a mouse, is tracked by the server and a 
cursor is echoed at that position on the display by the 
server. A keyboard takes its position from the pointer, and 
its focus is either explicitly set or is determined by the 
position of the pointer.* It was obvious that additional key 
devices should be treated like the keyboard, but it was not 
obvious how additional motion devices should be treated. 

The alternatives were either to treat all devices supported 
through the extension like the keyboard or to treat additional
motion devices like the pointer. If they were all
treated like the pointer, the server would have to track their
position and echo a cursor for them, and not allow their
focus to be explicitly set by the client. If they were absolute
devices, their input would have to be scaled to the screen.
If instead they were treated like the keyboard, the server
would not have to track their position individually but
would take it from the position of the pointer. The server
would not echo a cursor for them, but would leave that up
to clients and allow their focus to be explicitly set. To give
clients maximum flexibility, it was decided to treat all
devices supported through the extension like the keyboard.

*When a keyboard key is pressed, one of the parameters returned to the application is the
pointer (cursor) position.

Input Devices and Starbase 

The Starbase library provides functions to open input 
devices and to receive two- or three-dimensional world- 
coordinate input. Several device drivers exist that allow 
Starbase to receive input from different devices or from 
the same device in different environments. In some of these 
environments, access to input devices has been exclusive, 
allowing only one program at a time to open and access a 
device. Shared devices for Starbase applications have been 
supported under previous HP window systems, but only 
for a pointer and a keyboard. Therefore, the major Starbase 
contribution to this project has been providing the ability 
for multiple programs to share all input devices. 

At first it was not known how to achieve the desired 
device sharing functionality. However, once it was deter- 
mined that an input extension would be provided, the basic 
approach was to provide device driver code that uses either 
core or X extension Xlib calls to obtain input from the 
requested devices. In this manner, the X server provides
shared access to all devices for both Starbase and X clients
(see Fig. 1). The X server arbitrates the sharing of input 
devices between programs, and applies normal focus rules 
to Starbase and X programs. The new device driver code 
is similar to the existing Starbase HP-HIL driver code,
differing only in how it obtains input from a device.

The syntax of the gopen request, which describes the input 
device to be opened, was enhanced to allow the specification
of an input device and window combination. This
allows the driver to make a request in the form expected 
by the X server to open that device and request input from 
it. Since many Starbase programs specify this information 
through HP-UX environment variables or program param- 
eters, they can take advantage of the enhanced syntax with- 
out changing the source code of the program. 

It was possible to access the core input devices through 
Starbase input requests in previous releases of X, and com- 
patibility has been maintained so that client programs can 
continue to access these devices as before. However, in 
previous releases of X, except for the keyboard and pointer,
it was not possible to access input devices in a manner 
that would allow them to be shared among programs. Also, 
it was not possible to access them across a network. As a 
result of this project, programs can take full advantage of 
the window system and network, while continuing to use 
additional devices and access them for Starbase input. 

Direct Access to Input Devices 

Client programs can open and access input devices di- 
rectly that are not in use by the X server. This allows a 
program that was not written for a windowed environment 
to continue to work. However, only one instance of that 
program can be run at a time, thus preventing other X 
clients from using that device. Although a good feature for 
existing programs that do not require a windowed environ- 
ment, direct accessing of input devices is not a recom- 
mended practice for any newly written or ported programs. 

The core pointer and keyboard devices cannot be directly 
accessed by client programs, since the X server opens those 
devices. 

Conclusion 

The result of this project is that existing applications are 
supported, and an easy transition to a windowed environ- 
ment is provided for them. As shown by Fig. 1, programs
have a number of optional ways to access the input devices. 
Exclusive access to input devices other than the core de- 
vices is supported, although not recommended for new 
clients. Shared access through X libraries is supported for 
both core and extension input devices. Shared access 
through Starbase input routines is supported for both core 
and extension input devices, and is provided in a way that 
minimizes changes to existing Starbase programs.

Sharing Testing Responsibilities
in the Starbase/X11 Merge System

The testing process for the Starbase/X11 Merge software
involved setting realizable quality goals, and using 
extensive test suites and test tools to measure and automate 
the process. 

by John M. Brown and Thomas J. Gilg 



WITH THE DEVELOPMENT OF the Starbase/X11
Merge environment, new forms of testing had to
be considered. Before the Starbase/X11 Merge
project, the X test suites consisted of nearly 450 tests, and
the Starbase test suite contained nearly 400 tests run across 
an average of 40 hardware configurations. The challenge 
was to make the appropriate modifications to this extensive 
set of tests to make them useful in the Starbase/X11 Merge
environment. In areas where the existing test suites were 
inadequate, new test tools and tests had to be developed. 

Test and Quality Goals 

The combination of existing and new test suites needed 
to ensure adequate code coverage. Adequate code coverage 
in this context means exercising all procedural interfaces 
(i.e., X and Starbase library calls), and the in-depth testing
of each procedure. An HP software tool known as the 
branch flow analyzer (BFA) was used to measure code 
coverage. Code quality was measured in terms of defect 
densities and defect arrival rates. The project quality goals 
were stated in terms of acceptable defect densities (defects 
per 10 KNCSS*) for each class of defect severity. Furthermore,
defect arrival rates (defects per 1,000 test hours) were
closely monitored throughout the project, and objectives 
were set to achieve specific diminishing arrival rates at 
project checkpoints. 
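
The two quality measures are simple ratios. The sketch below shows the arithmetic only; the function names are hypothetical, and no severity weighting is applied because the article does not give the weights used.

```c
#include <assert.h>

/* Defects per 10 KNCSS, i.e. per 10,000 noncomment source statements. */
double defect_density(unsigned defects, unsigned long ncss)
{
    assert(ncss > 0);
    return (double)defects * 10000.0 / (double)ncss;
}

/* Defects found per 1,000 hours of testing. */
double defect_arrival_rate(unsigned defects, double test_hours)
{
    assert(test_hours > 0.0);
    return (double)defects * 1000.0 / test_hours;
}
```

For example, 5 defects in 50,000 noncomment source statements is a density of 1 defect per 10 KNCSS, and 3 defects found in 1,500 test hours is an arrival rate of 2 defects per 1,000 hours.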

Strategy 

Existing test technologies for X and Starbase were reviewed
for their suitability in testing the Starbase/X11
Merge system. In several cases, the existing technologies 
and their related test suites required no modifications. In 
other cases, weaknesses were identified and an effort was 
undertaken to enhance the remaining test tools and test 
suites. With nearly 850 pre-Starbase/X11 Merge tests and
several hundred megabytes worth of time-proven archives, 
the value of such an undertaking was obvious. Two test 
strategies were undertaken. First, new tests were developed 
that could be directly incorporated into the existing test 
suites. Second, for all the test scenarios not covered, new 
test tools and tests were developed. 

In all cases, a high priority was placed on the automation 
of tests. A best-case scenario was envisioned in which all 
the code changes, deletions, and additions developed in 

*Thousands of noncomment source statements



one day would be tested overnight on all available re- 
sources, and a summary of the test results would be gener- 
ated automatically for inspection by the engineers the fol- 
lowing day. In addition to the testing effort, code reviews 
helped round out the quality assurance effort. A code re- 
view or code walkthrough was conducted for each new 
code module. Attendance included the code author, a mod- 
erator or code reader, and several reviewing engineers. 

Testing Measures 

To help guide the testing effort, several test and quality 
metrics were identified and used. These metrics include: 

■ Branch Flow Analyzer (BFA) Coverage. The branch flow 
analyzer provides a measure of how well all the code in 
the software under test is exercised (covered) during the 
testing effort. To use the BFA, the source file to be tested
is run through a BFA preprocessor which places counters
at all conditional statements and at the beginning of all
procedures (see Fig. 1a). The source file produced by the
preprocessor is then compiled in a standard manner. 
When the program is run, the counters embedded in the 
code update an external disk-based data base, which can 
later be analyzed. Analysis of the BFA data base provides
a summary of which procedures were called and, for each
procedure, a breakdown showing which conditional
paths were executed or, more important, missed
(see Fig. 1b). The BFA tool identified unexercised sections
of code to be targeted when writing new tests.

■ Defect Density. To measure the current product quality, 
the defect density described the expected number of se- 
verity weighted defects (critical, serious, low) per 10 
KNCSS. 

■ Defect Arrival Rate. As a way to sense trends in quality, 
the defect arrival rate described the number of defects 
found per 1000 hours of testing. 

■ Continuous Hours of Operation. A continuous hours of 
operation test was frequently executed to give an indica- 
tion of X server robustness, and to reveal any long-term 
execution side effects (e.g., memory utilization growth).
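
The branch coverage figure can be illustrated with a small counter table. This sketch is not the BFA tool: it keeps its counters in memory rather than in a disk-based data base, and the names probe_register, probe_record, and probe_coverage_percent are hypothetical. Each branch is registered up front so that a branch never executed still counts as a miss.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROBES 64

typedef struct {
    const char *function;   /* name passed by the instrumented code */
    int branch;             /* branch number within that function */
    unsigned long hits;     /* how many times the branch was executed */
} Probe;

static Probe probes[MAX_PROBES];
static int probe_count = 0;

/* Declare a branch so it is counted even if it is never executed. */
void probe_register(const char *function, int branch)
{
    assert(probe_count < MAX_PROBES);
    probes[probe_count].function = function;
    probes[probe_count].branch = branch;
    probes[probe_count].hits = 0;
    probe_count++;
}

/* Called from instrumented code each time a branch is taken. */
void probe_record(const char *function, int branch)
{
    for (int i = 0; i < probe_count; i++) {
        if (probes[i].branch == branch &&
            strcmp(probes[i].function, function) == 0) {
            probes[i].hits++;
            return;
        }
    }
}

/* Whole-number percentage of a function's branches hit at least once. */
int probe_coverage_percent(const char *function)
{
    int total = 0, hit = 0;
    for (int i = 0; i < probe_count; i++) {
        if (strcmp(probes[i].function, function) == 0) {
            total++;
            if (probes[i].hits > 0)
                hit++;
        }
    }
    return total > 0 ? (100 * hit) / total : 0;
}
```

With 13 registered branches of which 9 are hit, the sketch reports 69 percent.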

Engineer Test Suites 

The end users for the Starbase/X11 Merge product are
software engineers who develop high-performance graphics 
applications running in windowed environments. With 



42 HEWLETT-PACKARD JOURNAL DECEMBER 1989 

© Copr. 1949-1998 Hewlett-Packard Co. 



    extern bfarecord();
    extern bfareport();

    main(argc, argv)
    int argc;
    char *argv[];
    {
        char line[1000], *s;
        long lineno = 0;
        int except = 0, number = 0;

        bfarecord("main",1);
        while (--argc > 0 && (*++argv)[0] == '-') {
            bfarecord("main",2);
            for (s = argv[0]+1; *s != '\0'; s++) {
                bfarecord("main",3);
                switch (*s) {
                case 'x':
                    bfarecord("main",4);
                    except = 1;
                    break;
                case 'n':
                    bfarecord("main",5);
                    number = 1;
                    break;
                default:
                    bfarecord("main",6);
                    printf("find: illegal option %c\n", *s);
                    argc = 0;
                    break;
                }
            }
        }
        if (argc != 1) {
            bfarecord("main",7);
            printf("Usage: find -x -n pattern\n");
        }
        else {
            bfarecord("main",8);
            while (getline(line, 1000) > 0) {
                bfarecord("main",9);
                lineno++;
                if ((index(line, *argv) >= 0) != except) {
                    bfarecord("main",10);
                    if (number) {
                        bfarecord("main",11);
                        printf("%ld: ", lineno);
                    }
                    else bfarecord("main",12);
                    printf("%s", line);
                }
                else bfarecord("main",13);
            }
        }
        bfareport("-rreport");
    }

(a)

function    # times    existing    # branches    % of
name        invoked    branches    hit           branches hit

main              1          13             9              69
index            14           5             5             100
getline          15           4             4             100

Totals                       22            18              82

A * preceding the function name indicates the
function was not hit.

3 functions in the program; 0 not hit
(i.e., 100% of the functions were entered)

Fig. 1. (a) A BFA instrumented source file. The names of the
instrumented functions are highlighted. The underlined lines
of code were inserted by the BFA preprocessor. They are
calls to the routine bfarecord, which handles the accounting on
the software being tested. (b) The summary test report provided
by BFA after the instrumented program is run.



this information we figured that some of the best test cases 
could be leveraged from the engineers developing the
Starbase/X11 Merge code. Therefore, an effort was made to
formalize the process that engineers naturally go through 
when trying a new version of the X server for the first time. 
All engineers were required to develop a short list describ- 
ing the types of tests they normally tried. When an integra- 
tion cycle approached, all engineers ran through their mini- 
suites and provided feedback. With little additional effort, 
such testing proved valuable. 

Starbase Test Suite 

The Starbase test suite has traditionally been used to 
perform testing of the Starbase graphics library on all of 
HP's supported graphics display devices and workstation 
configurations. The test suite consists of nearly 400 test 
programs, archive files of expected results, and various 
shell scripts and C programs that control test suite automa- 
tion. 

When a test program is run as part of an automated ses- 
sion, the resulting standard output and errors are compared 
against the expected result archives. In addition, represen- 
tations of the various graphics images that may have been 
generated by the test program are compared with the ar- 
chives. Specific differences between actual and expected 
results are noted in a test suite log file, and simple pass/fail 
information is placed in a summary file. 
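
The comparison of text output against an archive can be sketched as a line-by-line check. This is an illustration under assumptions: the function names are hypothetical, both texts are held in memory, and the image comparison that the suite also performs is not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Compare actual output with the archived expected output, one line at
 * a time. Each differing line is written to `log` (if not NULL). The
 * return value is the number of differing lines. A trailing empty line
 * compares equal to the end of the other text.
 */
int compare_with_archive(const char *actual, const char *expected, FILE *log)
{
    int line = 0, differences = 0;

    assert(actual != NULL && expected != NULL);
    while (*actual != '\0' || *expected != '\0') {
        size_t alen = strcspn(actual, "\n");     /* length of current line */
        size_t elen = strcspn(expected, "\n");

        line++;
        if (alen != elen || strncmp(actual, expected, alen) != 0) {
            differences++;
            if (log != NULL)
                fprintf(log, "line %d: expected \"%.*s\", got \"%.*s\"\n",
                        line, (int)elen, expected, (int)alen, actual);
        }
        actual += alen;                          /* step past the line ... */
        if (*actual == '\n') actual++;           /* ... and its newline */
        expected += elen;
        if (*expected == '\n') expected++;
    }
    return differences;
}

/* Simple pass/fail word for the summary file. */
const char *verdict(int differences)
{
    return differences == 0 ? "PASS" : "FAIL";
}
```

The detailed differences would go to the log file and the verdict to the summary file, mirroring the two levels of reporting described above.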

Before the Starbase/X11 Merge system, the test suite was
used to test Starbase running only on a raw display device
rather than in a windowed environment. With the advent
of the Starbase/X11 Merge system, there was a need to
enhance our Starbase testing approach to include not only
raw device testing, but also testing of Starbase in the X
Window System environment.

Starbase test programs in the Starbase/X11 Merge environment
take two basic forms:

■ Window Naive. A window naive test can run either in
raw mode or in X. The test itself has no knowledge of
X, and does not create X windows itself, but instead 
relies on an outside mechanism to create the windows 
and direct the test to those windows. 

■ Window Smart. A window smart test can only run in X. 
By definition, a window smart program makes X calls, 
and usually creates its own output windows. 

The enhancements made to the Starbase test suite had 
to be able to support both varieties of test programs. An 
additional goal of the changes was to leverage as much of 
the existing test suite as possible. To test window naive 
Starbase programs, the test suite was modified so that it
could recreate various selected X window scenarios and 
then run test programs in each scenario. Since window 
naive programs can be run on a raw display or in an X 
environment, we were able to use a set of the existing test 
programs in these scenarios. Of course, new archives of
expected results had to be created for each scenario. 

To cover window smart testing, an additional X window 
scenario was used in the test suite. Also, since none of the 
existing test suite programs contained both Starbase and 
X library calls, a set of new test programs had to be written 
to test this new functionality adequately. Areas of particular
testing attention included text fonts, cursors and echoes,









color map manipulation, backing store, double buffering, 
and Z-buffering.

Once the changes to the test suite were in place, the suite
was run nightly in a test center stocked with a complete 
set of graphics display devices and workstation configura- 
tions. An additional set of tools was developed to gather 
and report test results automatically from each configura- 
tion on a daily basis. This was done even during the latter 
part of the Starbase/X11 Merge project implementation
phase and it enabled developers to track the quality of their 
code as it was being completed. During the testing and 
release phases, the nightly test suite results helped ensure 
continuing improvement in code quality and stability.

X Test Consortium Test Suites 

Through HP's affiliation with the X Test Consortium, 
several X test suites were acquired. The Digital Equipment 
Corporation's X test suite (nearly 350 tests) tests each call 
available in the Xlib library. The tests themselves come in
two categories. Good-only tests, or centerline tests,
just test for expected functionality. Validate and error tests
expand on the centerline tests by checking for robustness 
using invalid parameters and erroneous functionality. 

The Sequent Computer Corporation, which is a member 
of the X Test Consortium, provided an X test suite that 
consists of nearly 125 tests that exercise the server at the 
X protocol level. The tests themselves do not use Xlib, but 
instead contain custom buffering routines to send X pro- 
tocol requests to and receive replies from the server. The 
object of these tests is to see how well the X server handles 
malformed protocol packets not normally generated 
through the X library calls. 

Early in the testing effort, the decision was made to make 
the X Test Consortium suites more manageable by control- 
ling them with HP's scaffold automation tool. 1 The scaffold 
provided the framework to manage the large body of tests, 
and also provided some input and output archiving. With 
the scaffold in place, the test suites were run nightly by 
an HP-UX cron script on all unoccupied workstations used 
by the Starbase/X11 Merge development team.

HP-HIL and Input Extension Test Suite 

With the addition of several input extensions to the X 
server, a new input extension test suite had to be developed.
Previous input testing tools proved to be inadequate for 
three reasons: 

■ HP-HIL (HP Human Interface Loop) activity was usually 
captured after some processing of the HP-HIL activity
had already occurred.

■ Previous test tools required that the code under test be 
modified to accommodate the test mechanism. 

■ Previous test tools could only handle keyboard and 
mouse activity, thereby excluding the new HP input ex- 
tensions to the server. 

The HP-HIL simulator, which was leveraged from an 
existing HP Windows/9000 test tool, allows multiple HP- 
HIL devices to be simulated and tested at once, including 
the new input extensions. The HP-HIL simulator operates 
in record/playback modes. The record mode requires the 
HP-HIL devices, the simulator, and a tester to run the test 
and use the HP-HIL devices. When it is recording, the 



simulator captures all HP-HIL activity and puts it into a 
file. In playback mode, the simulator uses the file captured
during the record mode in place of the real HP-HIL devices. 
The tester only needs to start the test program in playback 
mode. All of the HP-HIL data, regardless of its source, is 
sent to the server. 

The HP-HIL simulator is installed by creating a pty
(pseudo tty) in the /tmp directory for each input device on
the HP-HIL loop. This sets up a communication path between
the ptys and the real HP-HIL devices. To ensure that
the X server will use the ptys in /tmp, an appropriate entry
is made in the server's Xndevices file to change each device's
path from /dev to /tmp. The Xndevices file is used by the server
to determine its input device locations.
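
The path change amounts to swapping the directory part of each device name. A minimal sketch follows, assuming device names begin with /dev/ and that the pty keeps the same base name under /tmp. The function name and the device name used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the redirected form of `path` into `out`: "/dev/NAME" becomes
 * "/tmp/NAME", and any other path is copied unchanged. Returns 0 on
 * success, or -1 if the result does not fit in `out_size` bytes.
 */
int redirect_device_path(const char *path, char *out, size_t out_size)
{
    static const char from[] = "/dev/";
    static const char to[] = "/tmp/";
    int written;

    assert(path != NULL && out != NULL && out_size > 0);
    if (strncmp(path, from, sizeof from - 1) == 0)
        written = snprintf(out, out_size, "%s%s", to, path + (sizeof from - 1));
    else
        written = snprintf(out, out_size, "%s", path);

    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

For a device named /dev/hil1, the sketch produces /tmp/hil1, which is where the simulator's pty would be expected under this assumption.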

When a recording test session is started and the server 
tries to open what it thinks is an HP-HIL device, it is con- 
nected to a pty and the HP-HIL simulator is triggered to 
open the real HP-HIL device. Once this is done, the HP-HIL 
simulator, transparent to the test program, passes all HP- 
HIL device activity back and forth while saving all HP-HIL 
activity along with timing data into a file. The timing data 
ensures that realistic playback is provided. Fig. 2a shows
the setup for test recording. 

For HP-HIL playback, the file that was saved during re- 
cording is simply read by the simulator, and the appropriate 
HP-HIL activity is generated in the same time sequence it 
was recorded and fed into the pty. During the playback 
sessions the real HP-HIL devices do not have to be present 
on the HP-HIL loop. This facility allows suites recorded 
using the HP-HIL simulator to run on any machine without 
concern for the presence of HP-HIL devices, which are
sometimes hard to find. The setup for playback is shown 
in Fig. 2b. 
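
The role of the timing data can be sketched with a small in-memory log. The record layout, the fixed log size, and the function names here are assumptions made for illustration; the actual simulator saves its data in a file whose format is not described in this article.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_CAPACITY 256

typedef struct {
    unsigned long time_ms;   /* time since the recording started */
    int device;              /* which simulated device produced the data */
    unsigned char data;      /* one byte of device activity */
} HilRecord;

typedef struct {
    HilRecord rec[LOG_CAPACITY];
    size_t count;
} HilLog;

/*
 * Record mode: append one piece of activity with its timestamp.
 * Returns 0 on success, or -1 if the log is full or time runs backward.
 */
int log_append(HilLog *log, unsigned long time_ms, int device,
               unsigned char data)
{
    assert(log != NULL);
    if (log->count == LOG_CAPACITY)
        return -1;
    if (log->count > 0 && time_ms < log->rec[log->count - 1].time_ms)
        return -1;
    log->rec[log->count].time_ms = time_ms;
    log->rec[log->count].device = device;
    log->rec[log->count].data = data;
    log->count++;
    return 0;
}

/*
 * Playback mode: how long to wait before replaying record `i`, so that
 * activity reaches the server with the same spacing as when recorded.
 */
unsigned long delay_before(const HilLog *log, size_t i)
{
    assert(log != NULL && i < log->count);
    if (i == 0)
        return log->rec[0].time_ms;
    return log->rec[i].time_ms - log->rec[i - 1].time_ms;
}
```

A playback loop would wait for delay_before(log, i) and then write record i to the pty for its device.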

The HP-HIL simulator was used to test the server input 
extensions, and was then incorporated into the test scaffold.
The simulator was also used by another group to
simulate foreign versions of the HP-HIL keyboard to test
native language support (NLS) functionality.

GRM Test Suite 

The graphics resource manager (GRM) is composed of a
daemon process and a client interface library. The suite of 
tests that was developed for the GRM system is partitioned 
according to the various functional components of the sys- 
tem. A test module was developed for each of the following 
functional categories: 

■ Client Server Protocol. The serial data stream between 
the GRM client and the GRM daemon. 

■ Object Allocation (including semaphores). The maintenance
of all display hardware resource allocations.

■ Offscreen Memory Management. The allocation and 
deallocation of three-dimensional blocks of offscreen 
memory. 

■ Shared Memory Management. The creation, allocation, 
and deallocation of chunks of shared memory. 

■ Sequence Control. The maintenance of request se- 
quences for multiple processes. 

■ Listing of Objects. The wild-card matching and listing 
of all GRM objects. 

With the exception of the protocol test module, all of
these test modules tested the operation of the GRM daemon
through the standard GRM interface library. For the protocol
test module, some library routines were replaced with
altered copies of the original library routines to achieve
the desired test procedure.

Although the GRM daemon is designed to operate with 
multiple clients, the tests were designed to have exclusive 
use of the GRM. If another GRM client process was detected
by the test process, the test would identify the error and
exit. Since only one GRM daemon will run on a single host
at any particular time, the test environment had to be free
of any graphics applications that used Starbase or the X
server.

XDI Test Harness 

The X driver interface, or XDI, has about four dozen entry
points in the device dependent portion of the X server.
The X driver interface provides an interface between a
translation module and the low-level X display drivers that
perform the actual display control and rendering operations
on the display hardware. The translation module is
responsible for translating requests from the device independent
portion of the X server into a form suitable for the
X display drivers. This architecture allowed independent
development by HP engineers in two different organizations
and locations, and provided a platform for code sharing.
The X server code was done at HP's Corvallis Information
System Organization, and the display drivers (for X
and Starbase) were done at HP's Graphic Technology Division.
The article on page 6 describes the Starbase/X11
Merge X server and the XDI, and Fig. 2 on page 9 shows
the X server architecture.

With the significant advantages of this newly defined
interface, there came corresponding new testing demands,
because high-quality, well-tested X display drivers had to
be delivered at regular intervals, and these drivers had to
be developed whether or not any server code was available.

While much of the underlying driver code was shared 
by the Starbase driver code, the X driver interface was 
tailored to the needs of the X server. The differences be- 
tween the Starbase driver interface and XDI were sufficient 
to prohibit direct use of the Starbase test suite. Since the
test suite could not be directly used, other approaches were 
explored that would meet our testing needs and leverage 
as much of the existing test suite technology as possible. 

To provide a tool for debugging and automated testing, 
the XDI test harness was developed. The harness provides:

■ A user interface for each XDI entry point 

■ A means for importing and manipulating the associated
data structures 

■ Support for a subset of C programming language com- 
mands. 

What makes the harness an unusual testing tool is the 
way in which it acts as an interpreter that receives input 
commands either interactively or from text script files. 

The XDI test harness offers several advantages over more
traditional testing approaches that involve compiling various
test programs and then linking each of them with the
code under test. The harness needs to be linked only once
with the code under test, and since the harness is
interpreter-based, any number of test programs can be run without
the need to link each one. The harness also makes test
programs easier to write and modify because it provides a
convenient interface to the XDI entry points and the ability
to manipulate data structures. Finally, disk space is
conserved because only the harness and not the numerous test
programs need to be linked with the large driver libraries.
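
The link-once, interpret-many idea can be illustrated with a minimal
sketch. Everything in it is an assumption made for illustration: the
entry-point names, the script syntax, and the state dictionary are
hypothetical and are not the actual XDI interface or harness language.

```python
# Hypothetical sketch of an interpreter-style test harness.
# Entry-point names and script syntax are illustrative assumptions,
# not the real XDI interface.

def set_foreground(state, pixel):
    """Stand-in for a driver entry point that updates rendering state."""
    state["foreground"] = pixel
    state["calls"].append(("set_foreground", pixel))
    return 0  # 0 means success in this sketch


def fill_rectangle(state, x, y, width, height):
    """Stand-in for a driver entry point; records the call, draws nothing."""
    state["calls"].append(("fill_rectangle", x, y, width, height))
    return 0


# The harness is bound to the code under test once, through this table.
ENTRY_POINTS = {
    "set_foreground": set_foreground,
    "fill_rectangle": fill_rectangle,
}


def run_script(lines, state=None):
    """Interpret one command per line: <entry_point> <integer args...>."""
    if state is None:
        state = {"calls": [], "foreground": 0}
    for lineno, line in enumerate(lines, start=1):
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, *args = line.split()
        if name not in ENTRY_POINTS:
            raise ValueError(f"line {lineno}: unknown entry point {name!r}")
        # int(text, 0) accepts decimal and 0x-prefixed hexadecimal arguments
        status = ENTRY_POINTS[name](state, *(int(a, 0) for a in args))
        if status != 0:
            raise RuntimeError(f"line {lineno}: {name} returned {status}")
    return state


script = """
# a tiny test script; further scripts run without relinking anything
set_foreground 0xff
fill_rectangle 10 20 100 50
"""
result = run_script(script.splitlines())
```

Adding another test in this arrangement means writing another text
script, not compiling and linking another program, which is the
property the paragraph above credits for the savings in link time and
disk space.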

As a result of these advantages, the XDI test harness
proved to be a useful tool for XDI code development and
debugging. In addition, with relatively minor changes to
the Starbase test suite tools, the XDI test harness was
integrated directly into the test suite. An extensive set of new
harness test programs was developed to test all the types
of graphics display devices supported by the Starbase/X11
Merge system. Once the tools and test programs were in
place, this new XDI test suite was run nightly in the test
center.

Interactive Testing 

While it was desired to automate as many tests as possible,
not all server activities could be automated. Furthermore,
a measure of randomness not provided by the automated
tests needed to be added. Areas especially suited
for this type of testing included object manipulations with
the X cursor (e.g., moving a window), screen changes when
running in a stacked screens mode, operation in a multiserver
environment, and Starbase echoes (cursors operated by Starbase)
in X. Usually, interactive testing allowed a wider range of
scenarios to be tried. When certain scenarios were identified
as productive, an attempt was made to automate
them.

Conclusion 

With approximately 500 KNCSS between X and Starbase,
and over 80 different hardware configurations, testing the
Starbase/X11 Merge system proved to be very challenging.
Available test tools and test suites provided the bulk of
our automated tests, while the branch flow analyzer coverage
led to the development of new test tools and many new
tests. During the latter half of the Starbase/X11 Merge project,
we realized there was a need for more user-interactive
tests. While automated tests are indispensable, we found
that a great many interesting and important defects can be
uncovered with the randomness provided by user-interactive
testing.
expertise to user interface management systems.
He was part of the Starbase/X11 Merge team who
designed and tested the Graphics Resource Manager.
Courtney is single and lives in Corvallis, Oregon.
His outside interests are birdwatching,
botany, cross-country skiing, mountaineering,
canoeing, and visiting Western deserts.
Author's biography appears elsewhere in this
section.
Michael H. Stroyan

Mike Stroyan's special interests are in computer
graphics. After receiving his BS degree in computer
science from Colorado State University in 1982, he
joined HP and worked on graphics for the HP 9000
Series 200 HP-UX operating system and Starbase.
After joining the Starbase/X11 Merge project team,
he concentrated primarily on the input and display
drivers. Mike is a member of the ACM SIGGRAPH.
He is single, lives in Fort Collins, Colorado, and
enjoys skiing, volleyball, and swimming. He was born
in Lockport, New York.



John J. Lang

John Lang has worked on Starbase projects since he
joined HP in 1985. He contributed primarily to the
TurboSRX drivers and the SRX driver of the Starbase/X11
Merge system. Born in Lompoc, California, he received
his BS degree in wildlife biology in 1982 and
his MS in computer science in 1985, both from Colorado
State University. John is single, enjoys skiing,
and coaches a coed HP soccer team in a
citywide league in Fort Collins, Colorado, where he
lives. He is a member of the ACM.



Jeff R. Boyton

Jeff Boyton is a graphics engineer responsible for
the X Window backing store and for the pixmap
(memory) portion of the X Window System. A 1986
graduate of the Michigan Technological University
with a BS degree in computer science, he joined HP
the same year. He now has engineering responsibility
for the TurboSRX driver and administrates defect
tracking for Starbase. Away from work, he is
a frequent participant in volleyball tournaments.
Jeff is married and lives in Fort Collins, Colorado.
He is a member of the ACM.





Sankar L. Chakrabarti

Sankar Chakrabarti received degrees in chemistry
at Calcutta and Kalyani Universities,
and a PhD degree in chemistry and molecular
biology from the Tata Institute of Fundamental
Research in Bombay. He also pursued basic research in
molecular biology at Harvard Medical School and
the University of Oregon. A major career change
brought him to Intel Corporation and later to HP's
Corvallis Division in 1981. Since then, he has
worked on the X10 Window System and the Integral
PC before joining the team that developed the
backing store portion of the Starbase/X11 Merge
system. Born in West Bengal, India, Sankar lives
with his wife and two children in Corvallis, Oregon,
where he helps coach his son's soccer team. He
has an MS degree in computer science from Oregon
State University (1985).



Jens R. Owen

As a member of the technical staff of HP's Graphics
Technology Division, Jens Owen worked on the design
of raster fonts for the Starbase/X11 Merge project.
Born in Gaester, Denmark, he grew up in
Denver and received his BS degree from Colorado
State University in 1986. He joined HP after graduation.
Jens is newly married and lives in Fort Collins,
Colorado. His hobbies include snowboarding, volleyball,
and mountain biking.



John A. Waitz

A graduate of Colorado State University with a BA
degree (1980) and MSCS degree (1982), John Waitz
has worked on various Starbase assignments since
joining HP in 1983. His contribution to the Starbase/X11
Merge project was in the development of Starbase
and X Window display drivers and driver
utilities. John was born in Philadelphia, Pennsylvania,
but grew up in Boulder, Colorado. He is
single and lives in Fort Collins, Colorado. He plays
the piano, sings in a community jazz choir, and enjoys
tennis, running, and bicycling. He is a member
of the ACM and the IEEE.



Peter R. Robinson

Peter Robinson was a design engineer for the XDI
interface and direct hardware access (DHA)
extensions to the Starbase/X11 Merge server. His
earlier assignments include work on the HP Integral PC
and HP Portable Plus projects. Born in Lancaster,
California, he received his BS degree in computer
science from the University of California at Los
Angeles in 1975 and his MS degree in electrical
engineering from Stanford University in 1978. He
joined HP in 1980 at Corvallis, Oregon, where he
now resides. Peter and his wife provide home
schooling to their three daughters, for which they
design the curriculum. He also plays the piano and
accompanies his family on bicycle tours.



Keith A. Marchington

Development of global display controls and server
operational modes for the X Window System server
are Keith Marchington's primary contributions to the
Starbase/X11 Merge project. He has also been a
product marketing engineer on the X10 Window
System and a product support engineer for various
Corvallis products. Born in Portland, Oregon, he
lives in Corvallis, Oregon, where he joined HP in
1979. He has BS degrees in computer science and
mathematical sciences from Oregon State University
(1981). Keith is a single parent who enjoys
sports, reading, and a variety of activities with his
five-year-old son.



Steven P. Hiebert

A former U.S. Air Force staff sergeant, Steve Hiebert
was born in Portland, Oregon. He received his BS
degree in mathematics from Portland State University
in 1976. He joined HP in 198' and worked on the
compilers and compiler utilities for the HP Integral
PC. Before that, he worked on Pascal compiler
implementation and support at Electro Scientific
Industries and was the department manager for systems
programming at TimeShare Corporation. On
the Starbase/X11 Merge project, Steve's responsibilities
included the cursors, locks, rendering
state, combined mode, and stacked-screens
mode of the X Window System. He is single and
lives in Corvallis, Oregon. He is a member of the
IEEE and the ACM.
George M. Sachs

As an R&D engineer on the Starbase/X11 Merge project,
George Sachs was responsible for the support of
input devices by the X Window server and for proposing
and implementing standardized support for additional
input devices through the X Consortium.
He also has been a software quality engineer on
various other projects. Before coming to HP in
Ian A. Elliott

After receiving his BS (1984) and MS (1987) degrees
from the University of Utah and serving as
assistant director of the University's Department of
Bioengineering Surface Analysis Laboratory, Ian
Elliott joined HP and began working on the Starbase/X11
Merge project. His most recent contributions
have been to the X Window System, Starbase
device driver, and general architecture. He
authored a 1983 article about surface analysis for
the Journal of Electron Spectroscopy and Related
Phenomena. Ian, who was born in Detroit, Michigan,
lives in Fort Collins, Colorado with his wife and
new daughter. He spends most of his free time fixing
up their recently purchased home.
John M. Brown

London, Kentucky was home to John Brown until
he completed his BSEE degree in 1980 at the University
of Kentucky. His specialty of high-performance
3D graphics rendering is the result of several years
spent developing software for sonar systems at IBM
and hardware and software graphics architectures
at General Electric. He joined HP in 1988 and assumed
responsibility for quality assurance and
testing for the Starbase/X11 Merge project. John
is married, has two children, and lives in Ft. Collins,
Colorado. He enjoys running, snow-skiing, and
bicycling.



Thomas J. Gilg

Thomas Gilg earned his BS (1986) and MS (1988)
degrees in computer science from Montana State
University, where he was a graduate research assistant
working on the remote electronic animal data system
(READS) project at GeoResearch, Inc. He has
worked on the Starbase/X11 Merge project since
joining HP in 1988, particularly in testing for the
merge server, concentrating on mixed-mode tests.
Thomas is an avid fisherman; through his membership
in the Northwest Association of Steelheaders
he is active in the preservation of fishing habitats.
He also enjoys back-country traveling, hiking, and
cross-country skiing. He was born in Casper,
Wyoming, is single, and lives in Corvallis, Oregon.
system. When he joined HP in 1978, he had completed
some ten years of system programming experience
at Control Data Corporation. Since then, he has
held positions as a senior programming analyst for
manufacturing applications and as a field systems
engineer for commercial installations. He wrote and
presented a paper at the 1984 HP 3000 Users
Group (now Interex) semiannual conference describing
his research into the rules of Image. David
was born in Showell, Maryland, and lives in San
Jose, California. His hobbies are woodworking and
reading. He and his wife enjoy traveling to Hawaii
every year.

Michael B. Kalstein

Born in Elkins Park, Pennsylvania, Mike Kalstein
graduated in 1976 with a BA degree from Temple
University. He came to HP after receiving his MS
degree in computer science from the University of
Illinois in 1979. He soon joined the Santa [...],
California, sales office to become a systems
engineer responsible for presale support, training,
and field software coordination. In later assignments
as on-line support manager and systems
escalation engineer at HP's Commercial Systems
Division, he directed technical support for MPE
systems and administered the engineering staff.
His contributions to the HP Source Reader project
included developing part of the access and filtering
programs and coordinating the production of some
of the CDs. Mike and his wife are expecting their
second child in December. His leisure interests
include playing piano and guitar, song-writing, tennis,
bicycling, walking, and home improvements.
He lives in Campbell, California.

Stephen J. Pearce

When he joined HP in 1973, Steve Pearce had completed
a four-year internship in electronic R&D
at the British Ministry of Defense. Since then, his
assignments have included positions as a
bench engineer on microwave equipment, as a customer
engineer on computer systems, and as both
a technical support engineer and a response
center engineer on the HP 3000 Computer. On the
Source Reader project, he was the programmer for
the access program. Steve holds the equivalent of
a BSc degree in electronic engineering from the
Worcester Technical College. He was born in
Bromsgrove, Worcestershire, has two children,
and lives in San Jose, California. His favorite leisure
activities are astronomy and listening to music.
the HP 8116A Pulse/Function Generator. He joined
1989 in Paris. Born in Stuttgart, he is married, has
three boys, and lives in Herrenberg, Baden-Württemberg.
In his off-hours, he enjoys rebuilding
an old home, woodworking, and photography.
Larry J. Thayer

Larry Thayer worked on the HP 9000 Series 500
Computer and the SRX graphics system scan-conversion
chip before assuming responsibility for the HP
TurboSRX pixel processor. Presently, he is designing
VLSI for a future graphics hardware product. He
graduated from Ohio State University with a BSEE
degree in 1978 and an MS degree in 1979. He is
coauthor of a 1984 HP Journal article about the HP
9000 Series 500 Computer and a 1986 SIGGRAPH
paper describing the SRX scan-conversion chip.
A native of Lancaster, Ohio, Larry lives in Fort Collins,
Colorado with his wife and two children. His
hobbies include basketball, softball, and church
and family activities.
1987 issue of Electronic Design. Dave enjoys repairing
automobiles and televisions and reading
science fiction. He is married, has a daughter, and
makes his home in Fort Collins, Colorado.
A Compiled Source Access System Using
CD-ROM and Personal Computers 

HP Source Reader is in use in virtually every HP support 
facility around the world, giving local support engineers fast 
access to complete source code listings for MPE, the 
HP 3000 Computer operating system. 

by B. David Cathell, Michael B. Kalstein, and Stephen J. Pearce 



HP SOURCE READER IS A SYSTEM for accessing
compiled source code stored on compact disk read-only
memory (CD-ROM) for purposes of system debugging.
The source code is stored in a proprietary format
that optimizes retrieval by the access program running on
an HP Vectra Computer.

HP Source Reader facilitates quick and efficient debug- 
ging of HP 3000 Computer systems by allowing the user 
to display source code at any point within a specified pro- 
cedure or segment. The user can then quickly scroll the 
display or jump to any other location with precise control. 
Relevant information can be "popped" onto the screen in
seconds. This includes identifier definitions, reference 
materials, and the assembly code corresponding to each 
source line. The program also provides many useful aux- 
iliary functions including searching, printing, logging, and 
a comprehensive set of customization options. A context 
sensitive help facility eliminates the need to consult writ- 
ten documentation. 

Unlike other source browsing systems, HP Source Reader
was written by and for engineers who debug HP 3000 Computers.
The user interface is designed to be familiar to
support engineers who may not be knowledgeable about 
personal computers. The program prompts users for infor- 
mation in the same format as other tools they use. In addi- 
tion, to make the program easy to use, HP Source Reader 
takes full advantage of the personal computer user inter- 
face, including keyboard, mouse, pop-up windows, and 
menus. 

To our knowledge, HP Source Reader is the first system
in the industry that combines the convenience of one-step 
source retrieval with the power of the CD-ROM and per- 
sonal computer (PC) technologies. 

HP 3000 Debugging— Before 

HP 3000 Computers are debugged, for the most part, by 
analyzing dumps of the computer's memory. When a sys- 
tem fails, the operator dumps the memory to magnetic tape 
and then restarts the computer. The tape is forwarded to 
HP where it is formatted and analyzed by an engineer. The 
engineer must examine source code while reading the 
dump, comparing the failed system to what the source code 
indicates should happen when the system is running nor- 
mally. The engineer is constantly alternating between the 
source code and the dump throughout the analysis of the
problem.

Historically, memory dumps have been printed on paper. 
This worked fine when the HP 3000 contained less main 
memory, but this practice has gradually become untenable 
with the advent of larger and larger systems. Therefore, 
interactive tools have been developed that allow a dump 
to be analyzed in an on-line mode. Over time, these interac- 
tive tools have been enhanced to the point where they are 
now powerful on-line tools that allow engineers to locate 
and format specific information in a memory dump easily. 
However, as these tools have matured, no parallel progress 
has occurred allowing efficient on-line examination of 
source code. Engineers have continued to depend on 
printed listings stored in a shared library area. 

Fig. 1 shows the complex manual process that must be
followed to locate specific source code in a listing from 
information presented in the memory dump. It should be 
apparent that this is exceptionally tedious. 

Project History 

In 1986, we began to rethink the strategy for the use of
workstations within our organization (HP Commercial Systems
Support). It seemed apparent that real productivity
gains could be made by engineers, managers, and support 
personnel through the use of readily available PCs and 
software. 

At the same time, we recognized that it was becoming 
feasible to marry the PC to the emerging technology of 
optical media. This marriage could provide a platform for 
an engineer to access the massive amount of information 
required for system level support of MPE, the HP 3000
operating system. 

It became apparent that much of the time spent in analyzing
system failures was not bringing expertise to bear on
the problem. Engineers were spending too much time in 
overhead activities — walking to a library, finding listings, 
and engaging in the long tedious process of source location. 
We felt this strictly mechanical work could and should be 
automated. 

We refined our ideas sufficiently to produce a PC-based 
demo version of a program. This gave us the opportunity 
to evaluate the user interface with feedback from engineers 
who would be users of the actual program, if and when it 
was produced. It also gave us a method to communicate 
our vision to software developers who had experience in 
the CD-ROM industry but who did not necessarily have
knowledge of our particular activities in supporting the HP
3000.

Unfortunately, when we surveyed what was available in
an attempt to save development effort, all we found were
natural-language-based keyword indexing data base engines.
These are not a viable solution because there is a
substantial difference between natural language and computer
language. For instance, the word "ball" has a small
number of meanings which are reasonably consistent from
document to document. However, the variable "x" may
have many different meanings depending on where it appears
in the source code.
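
The difference can be made concrete with a small sketch. The procedure
names, identifiers, and meanings below are invented for illustration;
the point is only that an index keyed on the bare word returns every
meaning of "x", while an index keyed on (procedure, identifier) returns
the one that applies at a given place in the source.

```python
from collections import defaultdict

# Hypothetical declarations: (enclosing procedure, identifier, meaning).
declarations = [
    ("PROC_A", "x", "integer: file number"),
    ("PROC_B", "x", "logical: force-close flag"),
    ("PROC_A", "buf", "array: record buffer"),
]

# A natural-language style index: keyed on the word alone.
keyword_index = defaultdict(list)

# A source-aware index: keyed on the word and the scope it appears in.
scoped_index = {}

for procedure, identifier, meaning in declarations:
    keyword_index[identifier].append(meaning)
    scoped_index[(procedure, identifier)] = meaning


def lookup(procedure, identifier):
    """Return the meaning of an identifier within one procedure, or None."""
    return scoped_index.get((procedure, identifier))


all_meanings_of_x = keyword_index["x"]   # two unrelated meanings
meaning_in_proc_b = lookup("PROC_B", "x")
```

A keyword engine can only offer the first kind of answer, which is why
it was judged unsuitable for locating definitions in source code.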

Eventually, we concluded that there were no existing 
solutions that we could leverage to meet our needs — we 
would have to develop a prototype. This first prototype 
was the proof of the validity of the concept. It had enough 
positive aspects to justify the resources to rewrite and then 
extend the programs. 

The main body of effort is now complete and the gener- 
ation of HP Source Reader CD-ROMs is becoming a routine 
manufacturing effort. The only remaining tasks involve 



small utility programs to automate some partially manual 
processes. In addition, we plan to continue to expand the 
functionality of the access program as good ideas are 
suggested and as time permits their implementation. 

Project Goals 

At the beginning of the project our overriding objective 
was to improve the efficiency of HP 3000 system debugging. 
To achieve this, we established the following goals: 

■ Elimination of paper listings to save time, space, and 
mundane labor. 

■ Full use of emerging technology to make engineers' time
as productive as possible.

■ Ease of use to minimize learning time and errors. 

■ Minimal impact on organizations that supply source 
code to avoid the need to reformat source code or modify 
procedures. 

■ Cost-effectiveness to make it easy for support organiza- 
tions to justify the expense required. 

CD-ROMs and PCs 

CD-ROM is a logical choice for the paperless environment.
A CD-ROM, a 4¾-inch plastic disk, can hold the
equivalent of a 35-foot stack of paper listings. This is
sufficient capacity to contain an entire release of the HP 3000
operating system and its supporting software. The disks
are inexpensive enough that each engineer can have a set.
Unlike paper, optical media are machine readable, allowing
for a wide variety of automated access techniques. Because
the CD-ROM is read-only, it cannot be overwritten; it
always has the integrity it had when it was manufactured.

Fig. 1. The traditional method of source code location.

The HP Vectra PC is an excellent system for implementing
a high-technology, ergonomic access program. It has
many features that make it comfortable for users: a mouse,
a full-color display, and the ability to pop up and remove
windows and menus as necessary. Since the Vectra is a
personal computer, each engineer has private use of the 
system and its performance is not impacted by other users 
competing for resources. In addition, the Vectra supports 
a vast array of commercially available software and hard- 
ware products. Some of these products provide mechanisms 
for switching quickly between the source code and the 
dump, capturing parts of both in an integrated document. 
The Vectra is widely available within HP and is already 
in use in many offices that would need to use HP Source 
Reader. 

Fig. 2 shows how HP Source Reader is used to accomplish 
the task shown in Fig. 1. These two diagrams clearly show 
the reduction in manual effort brought about by the access 
program. 

HP Source Reader 

HP Source Reader consists of two main parts. The first
is the data preparation system, which is used to generate
the CD-ROMs from the compiled source code as it is produced
by the lab. The second is the access program that
runs on the Vectra, which is used to locate and display the
source code stored on the CD-ROM.

CD-ROMs are generated whenever a new version of the 
MPE V or MPE XL operating system is about to be released. 
Each disk contains all the modules associated with a given 
version. Fig. 3 shows the process flow used to convert the 
data from its original form (in the lab) to its final form (on 
the CD-ROM). Raw source code is maintained in the lab, 
then compiled with the output listing files submitted for 
inclusion on the CD-ROM. The compiler listings are pro- 
cessed in a series of steps to produce a magnetic tape set. 
The tapes are sent to a mastering facility, which manufac- 
tures the disks. 

Structure of the CD-ROM 

The CD-ROM has exactly the same physical structure as 
the now familiar audio CD. The only real difference be- 
tween the two is the meaning of the information recorded 
on the optical media, which represents computer data in 
the case of the CD-ROM and digitized music on the audio 
CD. Data is recorded as a series of pits positioned in a 
continuous spiral (beginning at the center of the disk). The 
pits are read as ones and zeros when illuminated by a laser 
source. The bits are evenly spaced, requiring the drive to 
vary the rate of rotation to maintain a constant linear ve- 
locity. Additional bits are used to provide a high level of 
error correction. 
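
The relationship between constant linear velocity and rotation rate can
be worked through with a short calculation. The scanning velocity and
the inner and outer data radii used here are assumptions (figures
typical of compact discs, not values given in this article); the
formula is simply rotation rate = linear velocity / circumference.

```python
import math


def rpm_at_radius(linear_velocity_m_s, radius_m):
    """Rotation rate (rev/min) needed to move the track past the
    read head at the given linear velocity and track radius."""
    revolutions_per_second = linear_velocity_m_s / (2 * math.pi * radius_m)
    return revolutions_per_second * 60


# Assumed values, typical of compact discs but not taken from the article.
LINEAR_VELOCITY = 1.3   # metres per second
INNER_RADIUS = 0.025    # metres, start of the spiral near the centre
OUTER_RADIUS = 0.058    # metres, end of the spiral near the edge

inner_rpm = rpm_at_radius(LINEAR_VELOCITY, INNER_RADIUS)
outer_rpm = rpm_at_radius(LINEAR_VELOCITY, OUTER_RADIUS)
```

Under these assumptions the drive spins more than twice as fast when
reading near the center as when reading near the edge, which is why a
long seek across the disk also involves a change of spindle speed.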



Additional structure is imposed to make it possible to 
use the CD as a random-access device. A standard layout 
of the disk directories and files known as the High Sierra 
standard was proposed and widely accepted within the 
industry. Microsoft Corporation was active in the definition 
of the standard and quickly produced an intermediate level 
driver that makes all High Sierra CD-ROMs look like very 
large standard DOS discs (albeit read-only). CD-ROMs
recorded using this standard have approximately 550,000,000
bytes of available disk space for data and directories. The
wide acceptance of this standard and the availability of 
the Microsoft CD-ROM extensions made it possible for our 
project to develop our access program using the normal 
DOS file functions. 

From the very beginning of the project, it was evident 
to us that the organization of the many files that would be 
on the CD-ROM was of paramount importance. A poor 
choice would have resulted in terrible performance. The 
resulting design makes extensive use of DOS subdirectories 
to group modules in a pattern logically similar to that of 
MPE. Fig. 4 shows the directory structure of the CD-ROM. 
The root directory contains only a file describing the con- 
tents of the CD-ROM. The second level subdirectories are 
of three types — one for system libraries, one for programs, 
and one for the reference documents. 

The system library subdirectory contains only a file list- 
ing all the entry points and segments for that library. The 
modules themselves are located in subdirectories below 
the system library directory. Each module subdirectory 
contains a set of files containing the compressed source 
code, identifiers, cross reference, procedure map, and op- 
tionally, the object code for that module. 

The program subdirectories contain a set of files containing
the compressed source code, identifiers, cross reference,
procedure map, and optionally, the object code for
that program.

The document subdirectory contains a set of files containing
the compressed text, page list, table of contents,
and index for each document.

Fig. 2. HP Source Reader method of source code location.

In addition to providing good performance, this structure 
has proved to be quite robust — only small extensions were 
required to include the changes brought about by MPE XL. 
Originally, we had only one system library subdirectory, 
and now we have three. In addition, a new directory type 
was defined for include files (these are files that are incor- 
porated into the source code of multiple modules to provide 
common definitions, etc.). In the case of MPE XL, a set of
files containing the compressed source code, identifiers, 
and cross reference for the large include file DWORLD re- 
sides in that directory. Thus this shared information is 
recorded only once, greatly reducing the amount of disk 
space required. 
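
The layout described above can be sketched as a small path-building
helper. The directory names (LIB1, MODA, PROGZ, INCLUDE) and file names
(SOURCE, IDENT, and so on) are hypothetical; the article identifies the
kinds of files stored but not what they are actually called on the
disk.

```python
# Illustrative only: directory and file names are hypothetical.
FILE_KINDS = ("SOURCE", "IDENT", "XREF", "PROCMAP", "OBJECT")


def cd_path(*parts):
    """Join components into a DOS-style absolute path on the CD-ROM."""
    return "\\" + "\\".join(parts)


def module_file(library, module, kind):
    """Library modules sit in subdirectories below their system library."""
    if kind not in FILE_KINDS:
        raise ValueError(f"unknown file kind: {kind}")
    return cd_path(library, module, kind)


def program_file(program, kind):
    """A program's files sit directly in that program's subdirectory."""
    if kind not in FILE_KINDS:
        raise ValueError(f"unknown file kind: {kind}")
    return cd_path(program, kind)


def include_file(kind):
    """Shared include-file data is stored once, whichever module needs it."""
    if kind not in FILE_KINDS:
        raise ValueError(f"unknown file kind: {kind}")
    return cd_path("INCLUDE", kind)


module_source = module_file("LIB1", "MODA", "SOURCE")
program_xref = program_file("PROGZ", "XREF")
shared_identifiers = include_file("IDENT")
```

Because every path is computed from names the access program already
knows (library, module, program), a file can be opened directly with
ordinary DOS file calls without searching large directories.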



Fig. 4. CD-ROM directory structure.



Filters 

In the compact disc industry, a filter is a program that 
reads some form of data and reformats it for use on a CD- 
ROM. The files on the CD-ROM are designed and organized 
to facilitate rapid retrieval of the desired information. The 
application designer can take advantage of the fact that
optical media can be read but not written. Thus it is desir- 
able to do as much processing as possible during the data 
preparation phase. This should result in less processing 
and, presumably, faster data retrieval by the access and 
display programs. 

For the HP Source Reader project to succeed, we had to 
minimize any additional effort that might be required of 
other organizations. In our case, that meant that the input
data for the filter program would have to be the same
compiler-generated listing files that were already supplied for
each release of MPE. These are exactly the same files that 
we previously printed and archived in our library. 



The initial prototype filter was for SPL, the primary language
used in MPE V. The result was tantalizing in that it
gave us a glimpse of the tool that we had envisioned.

We learned from this prototype when we began the design
of the filter for Pascal/XL (the primary language used
in MPE XL). The major goal was to automate the processing
of the huge number of listing files. The logical solution 
was a data base that would contain enough information 
about each operating system module to make the need for 
human intervention minimal. Thus the filter could locate 
files on the system to be filtered, determine which filter 
was to be used, and record the results of the filtering in 
the data base. This goal also dictated that filtering be done 
on a more powerful computer system than a PC, and an 
HP 3000 Series 70 was chosen. 

A secondary goal was that the overall environment and 
the program structure be suitable for extending the filter 
for other programming languages. Proper design of the data
base would easily allow extending the environment. To
facilitate extending the filter program itself, we chose a 
three-pass philosophy. 

The first pass parses each input record and determines
what part of the listing it represents. It then reformats information
to be retained and writes it to the appropriate intermediate
file. The second pass performs certain cleanup
tasks such as removing duplicate information regarding
identifiers. The final pass generates the target files for the
CD-ROM.
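
The three-pass structure can be sketched in a few lines. This is only an illustration: the one-character record tags, the function names, and the in-memory lists standing in for intermediate files are assumptions made for the example, not the record layout or code of the actual filters.

```python
from collections import defaultdict

def pass1_classify(records):
    """Pass 1: decide which part of the listing each record belongs to
    and route it to the matching intermediate 'file' (a list here)."""
    intermediate = defaultdict(list)
    for rec in records:
        # Hypothetical tags: 'S' = source line, 'I' = identifier map
        # entry, 'X' = cross-reference entry. Anything else is dropped.
        kind, _, body = rec.partition(" ")
        if kind in ("S", "I", "X"):
            intermediate[kind].append(body.strip())
    return intermediate

def pass2_cleanup(intermediate):
    """Pass 2: remove duplicate identifier entries, keeping first-seen order."""
    seen, unique = set(), []
    for entry in intermediate["I"]:
        if entry not in seen:
            seen.add(entry)
            unique.append(entry)
    intermediate["I"] = unique
    return intermediate

def pass3_generate(intermediate):
    """Pass 3: build the target files (name -> text) for the CD-ROM."""
    names = {"S": "source", "I": "identifiers", "X": "xref"}
    return {names[k]: "\n".join(intermediate[k]) for k in names}

def run_filter(records):
    return pass3_generate(pass2_cleanup(pass1_classify(records)))
```

Under these assumptions only the first pass knows anything about the listing format, which is consistent with the observation that most of the code could be reused for other compilers.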

Although the first implementation using this three-pass 
philosophy was for Pascal/XL, we found that more than
95% of the code was retained when we extended the pro- 
gram to handle Pascal/3000 (the MPE V version of Pascal). 
The second and third passes were only minimally changed. 
Perhaps this result is not very surprising in the case of 
such closely related Pascal compilers. However, we found 
that more than 90% was retained when we implemented 
the SPL version of the filter. With these three filters, we
can now process 99% of the modules for both MPE V and 
MPE XL. 

Most of the processing is done by the filters. However,
there is a need to accommodate certain complex modules
that consist of multiple compilation units that may even 
be written in different languages. To keep the process as 
simple as possible, we filter each submodule and later em- 
ploy a merge utility, which we also developed. This pro- 
gram uses the data base to determine which submodules 
need to be merged. The source, identifier, cross reference, 
and optional object files are retained, but the procedure
map files are combined. Each procedure entry in the 
merged map file indicates which submodule contains it. 

Writing the filters was not a trivial task. We encountered
numerous difficulties. The biggest challenge was posed by 
inaccuracies in the compiled output. The filters detected 
numerous cases of cross references that didn't exist or were 
on pages other than what the compiler reported. The Pascal 
compilers support long identifier names but truncate them 
in many places. 

Additional challenges were provided by programmers. 
Some use NOLIST compiler directives to turn off listing 
output. Others use the DEFINE construct in SPL to improve
readability and shorten the code. Still others use different 
cross-reference programs whose formats are different from 
the ones for which the filters were written.

Premastering and Mastering 

Premastering is the process of converting files from standard
DOS format to High Sierra format. The files output
by the filters are standard DOS file images, while compact
discs are recorded according to the High Sierra standard. 
Premastering changes the structure, not the content, of the 
files. The conversion is done on a CD Publisher system 
manufactured by Meridian Data Systems. The output of 
the CD Publisher is a set of master tapes, which are then 
sent to a compact disk mastering facility. 

The mastering vendor takes the tapes and creates a CD-ROM
master with the same data structure. This will be
used to press CD-ROMs by a process identical to that used 
for audio CDs. The finished CDs are sent back to HP for 
packaging and distribution. 



Access Program Design Philosophy 

As mentioned above, a major goal for this project was to 
make the access program easy to use. This was especially 
important because most of the engineers who use it are not 
knowledgeable about personal computers. Therefore, we 
designed the screen layout with the major commands per- 
manently displayed on the second line. Above that line is 
an area that identifies the current procedure. It is also used
for dialog for commands that require it. The remainder of
the screen is used for displaying source code.

Commands are invoked by pointing at them with the
mouse. For systems without a mouse, the command can
be selected by pressing the slash key (/) followed by the
first letter of the command. When a command is selected,
a menu drops down from the command line listing the
subcommands. The user can point to the desired subcommand
with the mouse or type the first letter of the subcommand.
Prompts for additional information can be displayed
on the top line or in dialog boxes if more room is needed.

Many of the commands require information such as the
name of a procedure or a variable. We recognized that,
while the program is in use, this information is probably
already displayed on the screen. Therefore, we permit the
user to move the alpha cursor by pointing at a screen position
with the mouse, then selecting the command. When
the user is prompted for the name of a procedure or variable,
the access program automatically displays the identifier
above the cursor as the default value.

Another design decision was the extensive use of win- 
dows — temporary boxes that overlay the main screen and 
contain information gathered from some other place in the 
listing. For example, if the user wants to know more about
a variable used in the currently displayed code, the information
is displayed in a window overlaying the top of the
code area. Once the user has finished with the window, it
is removed and the code area is restored to its previous 
condition. 

Although HP Source Reader uses many windows, it is
not a Microsoft* Windows application. At the time the
project began, MS Windows was not an established product.
There was little known about OS/2 and Presentation
Manager. Therefore, we decided to implement the access
program as a character-based DOS application capable of
running in various environments including DOS, MS Windows,
and Quarterdeck DesqView.

However, we also decided to structure the program in
such a way that converting to MS Windows or Presentation 
Manager would be feasible without a complete rewrite. 
Thus, the program has a main loop, which checks for a 
user action (keystroke, mouse movement, or mouse button 
press). Control then passes to a routine based on the current 
internal state. That routine performs some action, possibly 
changes the internal state, and returns to the main loop. 
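
A minimal sketch of such an event loop is shown below. The state names, event strings, and handler functions are hypothetical and chosen only to show the dispatch-on-state structure; the actual program was written in Turbo Pascal and its states are not described here.

```python
def handle_browsing(event, ctx):
    """State handler: source code is displayed; '/' opens the command menu."""
    if event == "/":
        return "menu"
    if event == "down":
        ctx["line"] += 1
    return "browsing"

def handle_menu(event, ctx):
    """State handler: the next keystroke selects a command by first letter."""
    if event == "q":
        return None                 # QUIT ends the main loop
    ctx["last_command"] = event
    return "browsing"

HANDLERS = {"browsing": handle_browsing, "menu": handle_menu}

def main_loop(events):
    """Take one user action at a time and dispatch on the current state."""
    state, ctx = "browsing", {"line": 0, "last_command": None}
    for event in events:            # stands in for polling keyboard and mouse
        state = HANDLERS[state](event, ctx)
        if state is None:
            break
    return ctx
```

Because each handler returns to the loop after one action, the same handlers could in principle be driven by messages from a windowing system instead of by polling, which is the kind of conversion the structure was meant to allow.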

Another important attribute of the access program is the
speed of scrolling — we wanted it to be as fast as possible. 
Unfortunately, the access speed of the CD-ROM is only a 
bit faster than that of a flexible disk drive. Since most of 
our disk access is sequential, we implemented a buffering 
algorithm using buffers that are one sector long (2048 
bytes). A pool of buffers is allocated when the program 
initiates. The exact number depends on the amount of
memory available on the system. Buffers are linked in order
from most recently used to least recently used. When a 
new one is needed, the least recently used buffer is cleared 
and reused. This results in faster access than simply reading 
the individual records one at a time. Furthermore, the dis- 
play of information already in the buffers is very rapid, 
since no I/O is required.
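
The buffering scheme can be illustrated as follows. This sketch substitutes Python's `OrderedDict` for the linked list described above, and the class name, the `read_sector` callback, and the `disc_reads` counter are assumptions introduced for the example.

```python
from collections import OrderedDict

SECTOR_SIZE = 2048  # bytes per CD-ROM sector, as stated in the text

class SectorCache:
    """Fixed pool of one-sector buffers with least-recently-used reuse."""

    def __init__(self, read_sector, pool_size):
        self.read_sector = read_sector    # callable: sector number -> bytes
        self.pool_size = pool_size
        self.buffers = OrderedDict()      # least recently used entry first
        self.disc_reads = 0

    def get(self, sector):
        if sector in self.buffers:
            self.buffers.move_to_end(sector)    # now the most recently used
            return self.buffers[sector]
        if len(self.buffers) >= self.pool_size:
            self.buffers.popitem(last=False)    # reuse the LRU buffer
        self.disc_reads += 1
        data = self.read_sector(sector)
        self.buffers[sector] = data
        return data

    def read(self, offset, length):
        """Return up to `length` bytes starting at byte `offset`."""
        out = bytearray()
        while length > 0:
            sector, start = divmod(offset, SECTOR_SIZE)
            chunk = self.get(sector)[start:start + length]
            if not chunk:
                break                           # past the end of the data
            out += chunk
            offset += len(chunk)
            length -= len(chunk)
        return bytes(out)
```

Scrolling back over text that is still in the pool touches only `get`, so no disc access is needed; the pool size would be chosen from the memory available at startup.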

HP Source Reader is written in Turbo Pascal from Bor- 
land International with extensive use of routines in Turbo 
Power Tools Plus from Blaise Computing. The program 
employs numerous overlays which are carefully organized 
to preclude the possibility of thrashing. 

Access Program Command Overview 

The HP Source Reader access program is designed to 
provide engineers with the most flexible interface possi- 
ble — one that provides commands that allow the required 
code to be located with minimum delay. The program was 
developed by engineers who would use it in day-to-day 
work, so the command structure chosen complements the 
data provided by current tools. 

The main commands and subcommands of HP Source 
Reader are as follows: 

GOTO 

This is probably the most important command in the access 
program. It allows the user to select the exact code to dis- 
play. Subcommands allow different types of access to the 
source. In MPE, each module is located either in a library 
or an application program. In MPE V/E and MPE XL com- 
patibility mode, procedures are grouped into segments. In 
MPE XL native mode, segmentation is not used. To provide 
a consistent user interface, HP Source Reader defines "native
mode segment" to be interchangeable with "module."
GOTO has six subcommands:

SEGMENT. Allows the user to select a segment/module 
name to be used for the starting point for displaying source 
code. Optionally, the user can also provide an offset from 
that starting point. The user can limit the search domain 
to specific libraries to reduce search time. 
PROCEDURE. Identical to GOTO SEGMENT except that the 
user provides a procedure name as the starting point. 
ENTRY. Equivalent to GOTO PROCEDURE with an implicit 
offset to the main entry point of the procedure. This by- 
passes declarations and nested subroutines, procedures, 
and functions. 

CALL. Equivalent to GOTO ENTRY, plus the current module
and location are saved in a logfile, allowing the user to
return to this point at a later time. This mimics the call
and return mechanism used by a computer.
RETURN. Allows the user to return to a place in the source
code that was saved in the logfile as a result of an earlier
GOTO CALL.
APPLICATION. Allows the user to select an application program
to be displayed instead of a library module.
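
The CALL and RETURN subcommands behave like a stack of saved locations. The sketch below illustrates the idea; the `Navigator` class, its method names, and the use of an in-memory list in place of the logfile are assumptions for the example.

```python
class Navigator:
    """Tracks the displayed location and a stack of saved return points."""

    def __init__(self, module, offset=0):
        self.location = (module, offset)
        self.logfile = []               # in-memory stand-in for the logfile

    def goto(self, module, offset=0):
        """Plain GOTO: move without remembering where we came from."""
        self.location = (module, offset)

    def call(self, module, offset=0):
        """GOTO CALL: save the current location, then move."""
        self.logfile.append(self.location)
        self.location = (module, offset)

    def ret(self):
        """GOTO RETURN: go back to the most recently saved location."""
        if self.logfile:
            self.location = self.logfile.pop()
        return self.location
```

An engineer following a chain of procedure calls can therefore descend several levels with CALL and come back up one level at a time with RETURN.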

IDENTIFY 

This command displays information regarding identifiers 
defined in the source code. Three subcommands select 
different information to display. 

VALUE. The identifier map information supplied by the
compiler for the selected identifier is displayed in a window.
This includes type, class, and location or value.
DEFINITION. The source code containing the definition of
an identifier is displayed in a scrollable window.
LOCAL VARS. The identifier map information supplied by 
the compiler for all the local identifiers in the current pro- 
cedure is displayed in a scrollable window. 

SEARCH 

This command finds a specific item or pattern in the current 
module. Three subcommands determine the search method. 
Each can be done in a forward or backward direction. 
IDENTIFIERS. Finds the next or previous occurrence of an
identifier as supplied by the compiler cross-reference table.
TEXT. Searches forward or backward for text matching a pat- 
tern, which can include wildcard characters for increased 
flexibility. 

LEVEL. Searches in the required direction for a specific
block level. The block level is a function of the BEGIN-END 
statements in Pascal and SPL. Each BEGIN increments the 
level number, and each END decrements it. 
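
The block-level rule can be sketched as below. This is a simplification made for illustration: it counts only the keywords BEGIN and END, and it ignores comments, string literals, and constructs that close with END without a matching BEGIN. The function names are hypothetical.

```python
import re

TOKEN = re.compile(r"\b(BEGIN|END)\b", re.IGNORECASE)

def block_levels(lines):
    """Return the block level in effect at the end of each source line."""
    level, levels = 0, []
    for line in lines:
        for match in TOKEN.finditer(line):
            level += 1 if match.group(1).upper() == "BEGIN" else -1
        levels.append(level)
    return levels

def search_level(lines, start, target, forward=True):
    """Index of the nearest line from `start` whose level equals `target`."""
    levels = block_levels(lines)
    indexes = range(start + 1, len(lines)) if forward else range(start - 1, -1, -1)
    for i in indexes:
        if levels[i] == target:
            return i
    return None
```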

DISPLAY 

This command switches the display between code and sup- 
plementary information while retaining the previously dis- 
played information. Seven subcommands select what infor- 
mation to display. 

CODE. Returns to the source code display. 
PMAP. Displays the procedure map for the current module. 
This lists procedures with segment offsets, if applicable. 
REFERENCE. Displays the current page of the current refer- 
ence document. Useful documents such as internal specifi- 
cations are included on the CD-ROM. 
LIBRARY/MODULE/APPLICATION. Displays a list of procedures,
module/segments, or applications whose names match a
pattern.

STACK. Displays the current logfile CALL history.

TOGGLE

This command controls the state of three binary switches. 
ABSOLUTE-RELATIVE. Alters the way that code offsets are 
displayed. They can be ABSOLUTE (using the segment as a 
base) or RELATIVE (using the procedure as a base). 
HEX/OCTAL. Alters the radix of code offsets.
SOURCE ONLY/INNERLIST. Displays source code only or
source code interspersed with the corresponding assembly 
instructions generated for each source line. 

PRINT 

This command prints information to a printer or file. There 
are subcommands to control what is printed. 

REFERENCE 

This command selects a specific document or a location 
in that document using the table of contents or index.

CONFIGURE 

This command is used to customize the program by select- 
ing miscellaneous options for the access program to use. 
These include display colors, screen size, printer, function 
keys, and CD-ROM drive location.

HELP

Context-sensitive help text is provided to assist with any
difficulty using the program. For example, if the user is 
being prompted for some input, the HELP command dis- 
plays text that explains the exact nature of the input re- 
quired. This is most useful for a novice user, but even 
experienced users may need assistance from time to time 
with infrequently used features. 

QUIT 

This command gracefully exits from the program. A special 
logfile entry is made, saving the current location. This al- 
lows the user to issue a GOTO RETURN command the next 
time the program is run to resume displaying the source 
code that was being displayed when HP Source Reader 
was last terminated. 

An Example 

Fig. 5 shows part of a typical HP Source Reader session. 
An engineer is trying to locate the source line that aborted 
the system. From the memory dump the engineer has determined
that the code aborted in segment HARDRES at octal
offset 16562. The engineer switches from the dump analysis 
tool to HP Source Reader. Fig. 5a shows the screen after
the engineer selects the GOTO SEGMENT command and types
the segment name and offset. HP Source Reader locates the 
source code at that location, resulting in the screen shown 
in Fig. 5b. The cursor is positioned on the source line 
corresponding to the return point from the call to SUDDENDEATH;
the engineer has found the call that aborted the
system. 

From the code, it is apparent to the engineer that SUDDENDEATH
is called if CHECKLDEV determines that the value of
the variable LDEV is invalid. The engineer then needs to 
examine LDEV in the dump to determine what value it 
contained when the check failed. The engineer uses the 
mouse to point to LDEV on the screen, then invokes the 
IDENTIFY VALUE command. Fig. 5c shows the screen for 
doing this. HP Source Reader locates the identifier map 
information for LDEV and displays it in a window as shown 
in Fig. 5d. The engineer now knows that LDEV is found at 
location Q-%14, and therefore switches back to the dump
analysis tool and examines the value of LDEV found at that 
location in the memory dump. 

Conclusions 

HP Source Reader provides substantial increases in productivity
based on our personal experience, feedback from
support engineers, and management analysis. The time that
it takes an engineer to locate a specific source location has
been reduced from several minutes to a few seconds.
Further savings are achieved by direct access to supporting
information such as identifier maps, assembly code, reference
materials, and other sources. Significant cost savings
are achieved by the elimination of paper listings. These
savings include computer time, consumable items, labor
for printing and binding, and storage costs.

56 HEWLETT-PACKARD JOURNAL DECEMBER 1989

© Copr. 1949-1998 Hewlett-Packard Co.

HP Source Reader represents an important contribution 
to HP's commitment to customer satisfaction in support. 
Local support engineers now have fast access to complete 
source listings. Previously, maintaining such listings in 
every HP support office was not cost-effective. Today, more
problems are resolved by field support personnel. Custom- 
ers realize this as improved system availability. 

HP Source Reader is now in use in virtually every HP
support office around the world. Engineers tell us it is
indispensable, and managers at all levels have gone out of
their way to report that HP Source Reader has dramatically
improved problem resolution time.

*Microsoft is a U.S. registered trademark of Microsoft Corporation.

HP Source Reader successfully combines new optical 
media technology with the ease of use and power of the 
PC. Designed with HP's traditional "next bench" develop- 
ment philosophy, it seems to be developing into the method 
of choice for MPE system support engineers who analyze 
memory dumps.

Correction

In the left column on page 99 of the October 1989 issue, the
words "parallel" and "perpendicular" are transposed in equations
3 and 4, Fig. 2, and the associated text. Fig. 2a on page
99 shows reflectivity $R(\theta)$, not reflection coefficient $r(\theta)$ as stated
($R(\theta) = r^2(\theta)$). Fig. 2b shows $R^2(\theta)$, which is the fraction of light
reflected after two reflections. Also, Brewster's angle $\theta_B$ is approximately
61° instead of 59° as shown.

Transmission Line Effects in Testing
High-Speed Devices with a High- 
Performance Test System 

The testing of high-speed, high-pin-count ICs that are not 
designed to drive transmission lines can be a problem, 
since the tester-to-device interconnection almost always 
acts like a transmission line. The HP 82000 IC Evaluation 
System uses a resistive divider technique to test CMOS and 
other high-speed devices accurately. 

by Rainer Plitschka 



TODAY'S STATE-OF-THE-ART DIGITAL ASICs (ap- 
plication-specific integrated circuits) are charac- 
terized by faster and faster clock rates and signal 
transition times. In testing these devices, delivering the
test signals to the device under test (DUT) and precisely
measuring the response of the DUT can be a problem. To
maintain signal fidelity, transmission line techniques have
to be applied to the tester-to-DUT interconnection.

This paper illustrates how this critical signal path is 
implemented in the HP 82000 IC Evaluation System to 
obtain high-precision timing and level measurements even 
for difficult-to-test CMOS devices. The HP 82000 offers a 
resistive divider arrangement that provides terminated 
transmission lines to the inputs and outputs of the DUT. 
This makes it possible to test low-output-current devices
up to their maximum operating frequencies. The HP 82000 
tester also offers good threshold accuracy, low minimum 
detectable signal amplitude, and system software that sup- 
ports adjustment of the compare thresholds according to 
the selected divide ratio. 

Whether an interconnection between the tester pin electronics
and the DUT should be considered a transmission
line depends on the interconnection length and the transition
time of the driving circuitry. If

$$t_{pd} > \frac{t_r}{8} \qquad (1)$$

where $t_r$ is the equivalent transition time (0 to 100%) and
$t_{pd}$ is the propagation delay (electrical length) of the interconnection,
then the interconnection has to be treated as
a transmission line.¹ For delays less than 1/8 of the transition
time, the interconnection can be considered a
lumped element.

Table I shows propagation velocities of signals in different
types of transmission lines. Using equation 1 for a typical
ECL output or a modern CMOS output with a 20-to-80%
transition time of 1 ns, or 1.67 ns for 0 to 100%, and using
Table I for signal velocities, we can compute a maximum
interconnection length of 1.25 inch (3.1 cm) for a microstrip
line. There are no high-pin-count testers that even come
close to such a short interconnection length between the
pin electronics and the DUT. Therefore, a transmission line
model must be used.

Table I
Signal Velocity in Different Transmission Line Media

Type                 Velocity
Coax, air            1 ft (30 cm) per ns
Coax, foam-filled    8 in (20 cm) per ns
Microstrip, FR4      6 in (15 cm) per ns
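
The 1.25-inch figure can be reproduced with a short calculation. The sketch below uses the velocities from Table I and assumes a linear edge when scaling the 20-to-80% transition time to 0-to-100%; the function and dictionary names are made up for the example.

```python
# Propagation velocities from Table I, in inches per nanosecond.
VELOCITY_IN_PER_NS = {
    "coax_air": 12.0,
    "coax_foam": 8.0,
    "microstrip_fr4": 6.0,
}

def full_transition_time(t_20_80_ns):
    """Scale a 20-to-80% transition time to 0-to-100%, assuming a linear edge."""
    return t_20_80_ns / 0.6

def max_lumped_length_in(t_r_ns, medium):
    """Longest interconnection (inches) that still satisfies t_pd <= t_r/8."""
    return VELOCITY_IN_PER_NS[medium] * t_r_ns / 8.0

def is_transmission_line(length_in, t_r_ns, medium):
    """Equation 1: True when t_pd > t_r/8."""
    t_pd = length_in / VELOCITY_IN_PER_NS[medium]
    return t_pd > t_r_ns / 8.0
```

For a 1-ns (20-to-80%) edge on FR4 microstrip, `max_lumped_length_in(full_transition_time(1.0), "microstrip_fr4")` gives about 1.25 inches, matching the value quoted above.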

Transmission Line Impedance 

Besides signal velocity, the line impedance $Z_l$ is a characteristic
parameter of a transmission line. The value of $Z_l$
depends on the line type, geometric factors, and the electrical
parameters of the materials used. Table II shows typical
values and tolerances. Note that $Z_l$ typically lies within
a small range of values, and that quite high tolerances are
usual.

Table II
Transmission Line Impedance Characteristics

Line Type            Range of Z_l     Tolerance
Coax, foam-filled    50 to 100 Ω      2 to 10%
Microstrip, FR4      30 to 120 Ω      5 to 20%



The choice of a value for $Z_l$ in a high-speed tester environment
is influenced by three major factors. First, the
outputs of ECL devices normally are designed to operate
at $Z_l = 50\,\Omega$. However, 25 Ω and 100 Ω outputs exist.

Second, connecting a capacitance $C$ to the end of a transmission
line forms a low-pass filter. This occurs in a tester
when a DUT with input capacitance $C_{in}$ is connected to a
driver. It also occurs at a comparator input, which has a
lumped capacitance $C_{lumped}$ (see Fig. 1). The low-pass filter's
step response transition time $t_s$ (10% to 90%) is:

$$t_s = 2.2\tau = 2.2(Z_l C). \qquad (2)$$



A signal with transition time $t_r$ at the input to the filter
will be slowed down to a transition time of $t_{ro}$ at the output:

$$t_{ro} = \sqrt{t_r^2 + t_s^2} \qquad (3)$$



which adds additional delay at every point of the original 
transition. For the 50% point this delay is approximated 
by the factors shown in Table III. 

Table III
Delay for the 50% Point of a Transition Caused by
Low-Pass Filtering

                 t_r << t_s    t_r = t_s    t_r >> t_s
Delay at 50%     0.7 Z_l C     0.9 Z_l C    1.0 Z_l C

As a consequence, the impedance $Z_l$ should be as low
as possible, because $C$ (that is, $C_{in}$ or $C_{lumped}$) is always
nonzero.
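
Equations 2 and 3 and the Table III factors can be combined in a small helper. In this sketch the function names are hypothetical, and the factor-of-ten thresholds used to decide "much smaller" and "much larger" are an assumption; Table III itself gives only the three limiting cases.

```python
import math

def step_response_time(z_l_ohm, c_farad):
    """Equation 2: 10-to-90% step response time of the Z_l-C low-pass."""
    return 2.2 * z_l_ohm * c_farad

def output_transition_time(t_r, t_s):
    """Equation 3: root-sum-square of the input edge and the filter response."""
    return math.sqrt(t_r ** 2 + t_s ** 2)

def delay_at_50_percent(t_r, t_s, z_l_ohm, c_farad):
    """Table III factors; the 10x thresholds for '<<' and '>>' are assumed."""
    if t_r * 10 <= t_s:
        factor = 0.7
    elif t_r >= t_s * 10:
        factor = 1.0
    else:
        factor = 0.9        # Table III gives this value for t_r = t_s
    return factor * z_l_ohm * c_farad
```

As an illustration with an assumed 5-pF load on a 50-Ω line, `step_response_time(50.0, 5e-12)` is 550 ps, so a 1-ns input edge would leave the filter with a transition time of roughly 1.14 ns.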

The third factor influencing the value of $Z_l$ is the required
source current capability. To minimize it, $Z_l$ should be as
high as possible. To generate a voltage step $V_s$ to propagate
along the transmission line, the source has to provide current
$I_s$ according to Ohm's law:

$$I_s = V_s/Z_l. \qquad (4)$$

This is true for both the tester's driver circuit and the
DUT. Proper design of the driver circuit will ensure sufficient
drive current. However, some DUT outputs, especially
CMOS, cannot provide the current required over the
entire range of $Z_l$ values shown in Table II.

As a result of these considerations, a tester in which both
accuracy and speed are important will use an impedance
$Z_l$ of 50 Ω.

Termination Models 

To maintain pulse performance, a terminated signal dis- 
tribution system has to be used. Two methods of performing 
the termination are possible: parallel and series. 

Parallel termination uses a resistor $R_t = Z_l$ at the end of
the transmission line, as shown in Fig. 2. At time $t = 0$, a
voltage step $V_0$ is generated by the source. The forward
wave will see the line as a resistor $R = Z_l$, and therefore
$V_1(t = 0) = V_0$. At $t = t_{pd}$ the wave has reached the end
of the transmission line, and because $R_t = Z_l$,
$V_2(t = t_{pd}) = V_0$. No further reflections occur. The current
that must be provided by the source is $I_0 = V_0/Z_l$, and it
flows as long as $V_1(t) = V_0$. This model is applicable for
ECL outputs. The resistor $R_t$ is connected to $-2$ V.

Fig. 1. Our interconnection model showing low-pass filters
caused by capacitive loadings.

A version of this paper was originally presented at the IEEE European Test Conference,
Paris, 1989.

The series termination method uses a resistor $R_s = Z_l$ in
series between the source and the transmission line, as
shown in Fig. 3. At time $t = 0$, a voltage step $V_0$ is generated
by the source. The forward wave will see the line as a
resistor $R = Z_l$. Because of voltage splitting between $R_s$
and $Z_l$, $V_2(t = 0) = V_0/2$. At $t = t_{pd}$ the wave has reached
the end of the transmission line, and because of reflection
at the open end, $V_3(t = t_{pd}) = 2V_2(t = 0) = V_0$. After
$t = 2t_{pd}$, the reflected wave will reach the source side,
giving $V_2(t = 2t_{pd}) = V_0$. No further reflections occur,
since the source side is terminated. The current $I_0$ to be
provided by the source is $I_0 = V_0/(2Z_l)$ for the time $2t_{pd}$.
This termination model is appropriate for a driver circuit
in the tester.

Unterminated Environment 

When connecting a source with $R_{out} \neq Z_l$ to a transmission
line there is no matching element in the circuitry.
This situation arises when a DUT output, such as CMOS
or TTL, is connected to a tester channel in which the driver
has been set to high impedance and a high-impedance comparator
is used. Fig. 4 shows the resulting waveforms for
$R_{out} > Z_l$ and $R_{out} < Z_l$. As can be seen, the transmission
line mismatch creates a series of pulses that reflect back
and forth (ringing). The amplitude and number of steps
depend on the magnitude of the mismatch, and the duration
depends on the propagation delay of the line and the
number of steps.

Fig. 2. Parallel termination model.
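
The staircase and ringing waveforms can be reproduced with a simple bounce calculation. The sketch below assumes an ideal voltage step, a linear source resistance, a lossless line, and an open far end (a high-impedance comparator); real CMOS outputs are nonlinear, so this is only a first-order model, and the function names are hypothetical.

```python
def far_end_steps(v0, r_out, z_l, arrivals):
    """Far-end voltage after each wave arrival, at t = t_pd, 3*t_pd, 5*t_pd, ..."""
    gamma_src = (r_out - z_l) / (r_out + z_l)   # reflection coefficient at source
    wave = v0 * z_l / (r_out + z_l)             # initial forward wave
    voltage, steps = 0.0, []
    for _ in range(arrivals):
        voltage += 2.0 * wave                   # open end: incident + reflected
        steps.append(voltage)
        wave *= gamma_src                       # re-reflected at the source
    return steps

def arrivals_to_settle(r_out, z_l, tolerance, limit=1000):
    """Arrivals needed before the far end is within `tolerance` of its final value.

    The remaining error after n arrivals is |gamma_src|**n times the step size.
    """
    gamma = abs((r_out - z_l) / (r_out + z_l))
    n, error = 1, gamma
    while error > tolerance and n < limit:
        n += 1
        error *= gamma
    return n
```

With a matched source (`r_out == z_l`) the far end reaches its final value on the first arrival, which is the series-terminated case. With a 150-Ω source on a 50-Ω line, a 5-V step arrives as 2.5 V, 3.75 V, 4.375 V, and so on, and settling to within 1% takes seven arrivals, that is, 13 propagation delays. This is why the usable test frequency drops as the mismatch or the line length grows.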

Under these conditions, accurate timing and level measurements
are not easy.² For repeatability of measurement
results, the ringing should be completely settled before a 
measurement is made. Therefore, the device has to be tested 
at data rates far lower than maximum. Fig. 5 shows the 
relationship between the maximum possible test frequency 
and the electrical length of the interconnection for various 
degrees of mismatch (i.e., different device impedances),
assuming two different settling criteria. One of the two 
curves assumes that the waveform is allowed to settle 
within 10% of its final value before an opposite transition 
can be started. The other assumes 1%. 

Tester Parasitics 

The basic elements of a tester's pin electronics are a 
transmission line, a driver, and a comparator. There is nor- 
mally also an ac/dc switch for performing dc measure- 
ments. This switch, implemented using a relay, can cause 
problems. However, by proper selection of the relay type 
and careful design, the transmission line impedance can 
be maintained without significant parasitics. 

For stimulating the DUT, the driver output signal is fed
to the pin. Because of the input capacitance of the fixturing
and the pin capacitance ($C_{in}$), the driver transitions will
be slowed. This causes a delay as discussed above. Equations
2 and 3 and Table III can be used to calculate the
delay. Also, because of input leakage currents flowing
through the driver's source impedance ($R = Z_l = 50\,\Omega$), the
driver levels will change. For example, for an ECL device
with $I_{ih} = 500\,\mu$A typically, there will be a voltage drop
of $I_{ih} Z_l = 25$ mV.* Further problems will not occur.

For receiving DUT data, the comparator can be used in
two different modes (Fig. 6): high-impedance (high-Z) and
terminated (parallel).

In the high-Z mode, the driver is switched to high impedance,
resulting in a capacitance $C_{lumped}$ formed by the parasitics
of the amplifier's switched-off transistors. Assuming
a value of 3 pF for the compare chip ($C_c$) and 20 pF for the
driver ($C_d$), $C_{lumped} = C_c + C_d = 23$ pF and the resulting
step response time for the comparator input voltage is 2 ns.

In the terminated mode, the tester's driver is used for
termination, eliminating the capacitance $C_d$. Only the comparator's
input capacitance will limit the bandwidth, giving
a step response time of $2.2(C_c Z_l)/2 = 165$ ps. This value is
equivalent to an analog input bandwidth of 2 GHz.

Fig. 7 shows the step response as a shmoo plot. The
stimulus was a pulse with a transition time of $t_r = 200$ ps
from an HP 8131A Pulse Generator. The measured value
of the 10-to-90% transition is $t_m = 275$ ps. The resulting
intrinsic transition time $t_s$ of the comparator is therefore:

$$t_s = \sqrt{t_m^2 - t_r^2} \approx 190\ \text{ps}.$$

Fig. 3. Series termination model.
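
The figures quoted for the terminated mode can be checked numerically. In the sketch below, reading the division by 2 as the capacitance seeing $Z_l$ in parallel with the termination is an interpretation, the bandwidth conversion uses the common single-pole rule of thumb (0.35 divided by the rise time), and the function names are assumptions.

```python
import math

Z_L = 50.0        # ohms, line impedance used by the tester
C_C = 3e-12       # farads, comparator input capacitance (value from the text)

def terminated_step_response(z_l, c):
    """Terminated mode: C is taken to see Z_l in parallel with Z_l, i.e. Z_l/2."""
    return 2.2 * c * z_l / 2.0

def bandwidth_from_rise_time(t_rise):
    """Rule-of-thumb conversion for a single-pole response: BW = 0.35 / t_rise."""
    return 0.35 / t_rise

def intrinsic_transition_time(t_measured, t_stimulus):
    """Invert the root-sum-square relation to remove the stimulus edge."""
    return math.sqrt(t_measured ** 2 - t_stimulus ** 2)
```

`terminated_step_response(Z_L, C_C)` evaluates to 165 ps, which corresponds to a bandwidth slightly above 2 GHz, and removing the 200-ps stimulus edge from the 275-ps measurement leaves an intrinsic transition time just under 190 ps.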



*In this paper the subscripts o and i indicate output and input parameters, respectively,
and the subscripts h and l indicate high and low logic levels, respectively. Subscripts s,
d, and g indicate the source, drain, and gate, respectively, of a field-effect transistor.

Interfacing CMOS Devices

CMOS devices are usually unable to drive transmission 
lines. The output impedance of CMOS devices does not 
match typical transmission line impedances, and static 
power dissipation, which occurs when driving a termi- 
nated transmission line, may damage a CMOS device. 

Fig. 8 shows the operating characteristics of a CMOS
output buffer cell.³ The specified dc parameters $V_{ohmin}$ at
$I_{oh}$ and $V_{olmax}$ at $I_{ol}$ are marked.

The output resistance is not linear. For high $V_{ds}$, the cell
acts as a current source. For low $V_{ds}$, it is a voltage source
with a low resistance. The large-signal output resistance
$R_{out}$ can be defined for either high or low output by:

$$R_{outh} = V_{ds}/I_d$$
$$R_{outl} = V_{ds}/I_d \qquad (5)$$

where $V_{ds}$ and $I_d$ are corresponding values on the curves.

The worst-case output resistance, $R_{outl}$ or $R_{outh}$, is defined
when $V_{ds} = V_{ohmin}$ or $V_{olmax}$ and $I_d = I_{oh}$ or $I_{ol}$. Note
that there are major differences between typical and worst-case
resistances. The resistance also varies with the operating
temperature.

A CMOS output connected to a capacitance $C$ will perform
as shown in Fig. 9. Assume that the source FET is
turned on at $t = 0$ with $V_{ds} = V_{dd}$. The capacitor is charged
with constant current, resulting in a linear ramping voltage.
As the capacitor voltage increases, $V_{ds}$ and the output resistance
decrease. This decreases current flow into the
capacitor, which slows the voltage ramp. The resulting
capacitor voltage waveform resembles an exponential
curve.

The performance of a CMOS output driving a resistive
load $R_{load}$ connected to a voltage source $V_{load}$ is shown in
Fig. 10. The output voltage can be obtained by drawing a
line defined by $V_{ds} = V_{load}$ and $I_d = V_{load}/R_{load}$. The intersections
with the FET characteristics define the output voltage
and current for source and sink operation. The transition
times depend only on the internal switching. Loading
to the dc specifications can be obtained by using values
for $R_{load}$ and $V_{load}$ calculated as follows:

$$R_{load} = \frac{V_{ohmin} - V_{olmax}}{|I_{oh}| + |I_{ol}|}$$

$$V_{load} = \frac{V_{ohmin}|I_{ol}| + V_{olmax}|I_{oh}|}{|I_{oh}| + |I_{ol}|} \qquad (6)$$

A worst-case device loaded to $I_{oh}$ or $I_{ol}$ will have an output
voltage of $V_{ohmin}$ or $V_{olmax}$, respectively. A typical device
will have an output voltage greater than $V_{ohmin}$ or less than
$V_{olmax}$, respectively.
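
The load values follow from requiring that the load draw exactly the specified current at each specified output level. A small sketch, with a hypothetical function name and illustrative (not quoted) device specifications:

```python
def dc_load(v_oh_min, v_ol_max, i_oh, i_ol):
    """Return (R_load, V_load) that load the output to its dc specifications.

    Currents may be given with either sign; only magnitudes are used.
    """
    total = abs(i_oh) + abs(i_ol)
    r_load = (v_oh_min - v_ol_max) / total
    v_load = (v_oh_min * abs(i_ol) + v_ol_max * abs(i_oh)) / total
    return r_load, v_load
```

For an assumed device specified at 3.7 V with 4 mA sourced and 0.4 V with 4 mA sunk, `dc_load(3.7, 0.4, -4e-3, 4e-3)` returns 412.5 Ω and 2.05 V. Checking back, (3.7 − 2.05)/412.5 and (2.05 − 0.4)/412.5 both equal 4 mA, so the load draws the specified current in each state.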

CMOS Driving a Transmission Line 

Connection of a CMOS output directly to an open-ended
transmission line is shown in Fig. 11. Assume that $R_{out} > Z_l$
and a positive transition occurs at $t = 0$. At $t = t_{pd}$ the
device output will correspond to the intersection of the
FET's characteristic and the load line defined by $V_{ds} = V_{dd}$
and $I_d = V_{dd}/Z_l$. Calculating the device's output resistance
using equation 5, the output waveform behavior can be
predicted as discussed above for the unterminated environment.
Because of the nonlinear output resistance, slightly
different waveforms may occur depending on the actual
$V_{ds}$ and $I_d$. When $R_{out}$ is less than $Z_l$, the second step on
the source side may be higher than $V_{dd}$. If clamping diodes
are included between the output and $V_{dd}$, this reflection
can be reduced and further reflections will be inverted.

For CMOS outputs with R_out < Z_l, termination can be
achieved by adding a resistor R_s between the output and
the transmission line. The value should be:

R_s = Z_l - R_out



Fig. 5. Maximum test frequency in an unterminated environment
as a function of device impedance (Ω). Any DUT switching time
is assumed to be zero and any transition time is assumed to be zero.



This is the series termination model, which gives correct
pulse performance. This method has been suggested in the
past.4 However, its practical applicability is limited,
because the output resistance for positive and negative
transitions is generally not equal, and the output resistance
changes from sample to sample and is not stable with
temperature.

Fig. C. Operating modes of the receiver path.

The Resistive Divider Solution 

The resistive divider provides a solution to the problem 
of embedding a CMOS device in a transmission line envi- 
ronment. This technique is implemented in the HP 82000 
IC Evaluation System. 

The operating principle of the resistive divider is to apply 
a definable dc load to the DUT. Signal fidelity is maintained 
because the signal is fed into a parallel-terminated system: 
therefore, no reflections occur. 

Fig. 12 shows a schematic diagram of the resistive divider.
The resistor R_t is built into the tester. The resistor
R_s is selected by the user to give an appropriate divide
ratio for the particular DUT. R_s is installed on the DUT
board, which interfaces the DUT to the tester and is different
for each DUT. The user then tells the HP 82000 software
what the divide ratio is. The termination voltage V_t in Fig.
12 is also selected by the user.

Besides providing a terminated transmission line environment,
the resistive divider puts only a very small capacitive
load on the DUT (shown as C_par in Fig. 12). A value
as low as 2 pF can be obtained if R_s is close to the DUT
pin. This is possible using ceramic blade probes with
printed resistors. For high-pin-count devices (up to 512
pins), the tester's DUT board can be laid out with easily
installable resistors, keeping parasitics below 10 pF.

The length of transmission line between the DUT and 
the comparator does not affect the capacitive and resistive 
loading on the DUT. The termination is done by the tester's 
driver, which is part of the I/O channel. Therefore, the 
lumped capacitance that occurs if the driver is switched 
to high impedance is eliminated. This ensures a wide 
bandwidth for the compare path as discussed earlier. 

Fig. 7. Shmoo plot of the step response at the tester input
for an input signal with t_r = 200 ps.

The DUT output levels detected will be reduced by the
divide ratio. The resulting comparator input voltages can
be calculated by:

V_comp = (V_o R_t + V_t R_s)/(R_s + R_t)        (7)

where V_o is the actual high or low output voltage under
the defined load. This equation can also be used for
calculating the appropriate threshold setting. For ease of use,
this calculation is embedded in the HP 82000 tester software,
so that a user always thinks in terms of noncompressed
signals.
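As a rough sketch of how the compression in equation 7 and its inverse might be coded, the Python below maps a DUT output level to the comparator input and back. The function names, the 50-ohm value for the tester resistor, and the example level are assumptions made for illustration; this is not the HP 82000 software.

```python
def comparator_voltage(v_o, v_t, r_s, r_t=50.0):
    """Voltage at the comparator input for a DUT output level v_o (equation 7)."""
    return (v_o * r_t + v_t * r_s) / (r_s + r_t)


def dut_voltage(v_comp, v_t, r_s, r_t=50.0):
    """Inverse of equation 7: recover the noncompressed DUT level."""
    return (v_comp * (r_s + r_t) - v_t * r_s) / r_t


# Hypothetical example: R_s = 200 ohm, V_t = 2.2 V, DUT high level of 4.0 V.
v_comp = comparator_voltage(4.0, v_t=2.2, r_s=200.0)
v_back = dut_voltage(v_comp, v_t=2.2, r_s=200.0)
print(f"comparator sees {v_comp:.2f} V; back-calculated level {v_back:.2f} V")
```

The same inverse mapping is what lets a user specify thresholds in noncompressed terms: a desired DUT-side threshold is passed through `comparator_voltage` to obtain the value actually programmed at the comparator.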

Resistive Divider Parameters 

The selectable parameters of the divider are R_s and V_t.
There are several choices for defining the DUT load. Device
loading according to dc specifications is normally the best
choice. The DUT's maximum power consumption will
never be exceeded and throughput is improved. If the ac
test is performed with the specified dc loading, the need
for further dc fanout measurements is eliminated.

Dc loading specifications can be converted to resistive 
divider parameters using: 




Fig. 8. Output characteristics (I_d vs V_ds) of a CMOS output
buffer, where I_d is the drain current and V_ds is the
drain-to-source voltage across the FET.

Special ac loads are defined for timing measurements as
shown in Fig. 13. These ac loads can be converted to resistive
divider parameters by the Thevenin equations:

R_s = 1/(1/R_1 + 1/R_2) - R_t
V_t = V_dd R_2/(R_1 + R_2).        (9)
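The Thevenin conversion of equation 9 can be illustrated with a short Python sketch. It assumes that R_1 connects the output to V_dd and R_2 connects it to ground, which is consistent with the expression for V_t but is not stated explicitly in the text; the function name, the 50-ohm tester resistor, and the example values are likewise assumptions.

```python
def ac_load_to_divider(r_1, r_2, v_dd, r_t=50.0):
    """Convert an R_1/R_2 ac load into resistive divider parameters (R_s, V_t)."""
    r_thevenin = 1.0 / (1.0 / r_1 + 1.0 / r_2)  # R_1 in parallel with R_2
    r_s = r_thevenin - r_t                       # tester already provides R_t
    if r_s < 0:
        raise ValueError("load is lower than R_t; no series resistor can match it")
    v_t = v_dd * r_2 / (r_1 + r_2)
    return r_s, v_t


# Hypothetical load: 1 kilohm to a 5 V supply and 1 kilohm to ground.
r_s, v_t = ac_load_to_divider(1000.0, 1000.0, v_dd=5.0)
print(f"R_s = {r_s:.0f} ohm, V_t = {v_t:.2f} V")
```

The check for a negative R_s reflects the fact that the total divider resistance can never be lower than the built-in tester resistor.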



There are situations where the values of R_s and V_t calculated
using equation 8 cannot be used. Changing the loading
will change the output levels. The changed values can be
obtained from the output characteristic curves. The actual
values are defined by the intersection of the load line with
the FET curve. The worst-case values can be obtained from
the intersection of the load line and the worst-case output
resistance line, as shown in Fig. 14. These modified levels
can be calculated using:

V_ol' = (V_t R_maxl + V_opl R_load)/(R_maxl + R_load)

I_ol' = (V_t - V_opl)/(R_maxl + R_load)        (10)

I_oh' = (V_t - V_oph)/(R_maxh + R_load)

where V_opl and V_oph are the low and high level open-circuit
output voltages, and the values for worst-case output
resistance are given by equation 5.

Fig. 9. CMOS output driving a capacitive load.

Modified loading may result in higher power dissipation
for one of the output levels. It is recommended that power
consumption be checked using:

P_dl = V_ol' I_ol'

P_dh = (V_dd - V_oh') I_oh'        (11)

Practical tests have shown no problems as long as the
level change caused is less than 500 mV.
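The level and power checks of equations 10 and 11 can be sketched as follows. The sketch models each output state as an open-circuit voltage behind a worst-case output resistance, loaded by R_load to V_t. Applying the same expression to the high level, taking the magnitude of the current in the power calculation, and all numeric values are assumptions made for illustration.

```python
def loaded_level(v_open, r_out, v_t, r_load):
    """Output voltage and load current of one output state (form of equation 10)."""
    v = (v_t * r_out + v_open * r_load) / (r_out + r_load)
    i = (v_t - v_open) / (r_out + r_load)  # negative when the DUT sources current
    return v, i


def dissipation(v_dd, v_ol, i_ol, v_oh, i_oh):
    """Power dissipated in the output stage for each level (form of equation 11)."""
    p_dl = v_ol * abs(i_ol)
    p_dh = (v_dd - v_oh) * abs(i_oh)
    return p_dl, p_dh


# Hypothetical device: 100-ohm worst-case output resistance, rail-to-rail
# open-circuit levels, loaded with 1000 ohm to V_t = 2.5 V.
v_ol, i_ol = loaded_level(v_open=0.0, r_out=100.0, v_t=2.5, r_load=1000.0)
v_oh, i_oh = loaded_level(v_open=5.0, r_out=100.0, v_t=2.5, r_load=1000.0)
p_dl, p_dh = dissipation(5.0, v_ol, i_ol, v_oh, i_oh)
print(f"V_ol' = {v_ol:.3f} V, V_oh' = {v_oh:.3f} V")
print(f"P_dl = {p_dl * 1e3:.2f} mW, P_dh = {p_dh * 1e3:.2f} mW")
```

In this symmetric example both levels shift by about 0.23 V, which is within the 500 mV change that the text reports as unproblematic.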

To measure the reduced output signals resulting from 
the resistive divider, the comparator must be designed to 
detect small amplitudes. Two parameters affect this ability: 
comparator hysteresis and open-loop gain. The hysteresis 
is a positive feedback effect to ensure the comparator's 
stability. The open-loop gain is the comparator's amplify- 
ing factor for small signals, and is frequency dependent. 
Both parameters affect the finite voltage swing (overdrive) 
around the threshold that has to be applied to the com- 
parator input to obtain output switching (see Fig. 15). 

A high-performance comparator design will ensure that
the necessary overdrive will be constant up to the
maximum data rate. Smaller pulses can be detected as long
as sufficient overdrive is applied. For detection of a single
transition, the value for the flat section of Fig. 15 applies,
and the limiting element is the input signal bandwidth.

Fig. 10. CMOS output driving a resistive load.

Fig. 11. CMOS output driving an open-ended transmission line.

With a value of ±20 mV for the overdrive, and assuming
a dc accuracy of ±10 mV, the signal's amplitude at the
comparator input should be greater than 60 mV. A TTL-compatible
output will generate a swing of at least 2V.
Within a 50Ω environment, this allows a maximum divide
ratio r_max and a maximum value for R_s of:

R_smax = 1600Ω.        (12)

This means that devices having output currents
|I_oh| + |I_ol| ≥ 2V/1600Ω = 1.2 mA at TTL output levels can
be tested.
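The arithmetic behind equation 12 can be retraced with a small Python sketch. It assumes that the 60-mV minimum amplitude is twice the sum of the overdrive and the dc accuracy, and that the divide ratio is r = (R_s + R_t)/R_t with R_t = 50 ohms; the function name is hypothetical.

```python
def divider_limits(swing, overdrive, dc_accuracy, r_t=50.0):
    """Largest usable divide ratio and series resistor for a given output swing."""
    min_amplitude = 2.0 * (overdrive + dc_accuracy)  # peak-to-peak at comparator
    r_max = swing / min_amplitude                    # maximum divide ratio
    r_s_max = (r_max - 1.0) * r_t                    # from r = (R_s + R_t)/R_t
    i_min = swing / (r_s_max + r_t)                  # smallest testable |I_oh| + |I_ol|
    return r_max, r_s_max, i_min


r_max, r_s_max, i_min = divider_limits(swing=2.0, overdrive=0.020, dc_accuracy=0.010)
print(f"r_max = {r_max:.1f}, R_s max = {r_s_max:.0f} ohm, I min = {i_min * 1e3:.2f} mA")
```

Under these assumptions the result is a divide ratio of about 33 and a series resistor of roughly 1600 ohms, in line with the rounded values quoted in the text.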

DUT Power Dissipation 

At higher test frequencies, the resistive divider results
in less DUT power consumption than a capacitive load, as
shown in Fig. 16.

I/O Pin Considerations 

The resistive divider is applicable to I/O pins, with some 
additional considerations. 

The tester's driver can generate a two-level signal. These
levels should be set according to the DUT's input requirements,
that is, V_l ≤ V_ilmax, V_h ≥ V_ihmin. When receiving
signals from the DUT, one of these levels has to be used
for termination. This may mean that the calculated V_t does
not match the driving requirements. V_t and R_s should be
set to:



V, 



(13) 



R S = (V„hmln - VJ/U 

or V, =£ V ilm8X if I,,, « I oh 
R 5 = (V, - V lllmi , x )/I„,. 

The value of R_s is modified to ensure that none of the
output states will be loaded more than specified. This will
occur if only V_t is modified. Note that one level remains
less loaded. If DUT power dissipation is not critical, the
value of R_s need not be modified.

Fig. 12. Resistive divider model.

The device can be stimulated via the series resistor.
CMOS normally has negligible input current, so no level
errors occur. The input capacitance and R_s form a low-pass
filter, which limits the data rate and causes a delay at the
50% point on the transition:

Data rate = 2.3(Z_l + R_s)(C_in + C_par)
Delay at 50% = 1.0(Z_l + R_s)(C_in + C_par).        (14)
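A quick numeric sketch of equation 14 is given below. It treats both expressions as times (the first as the minimum time the RC filter allows per data bit), and it assumes Z_l = 50 ohms and a total capacitance of 10 pF; the function name and the chosen R_s are illustrative only.

```python
def drive_filter(r_s, c_total, z_l=50.0):
    """Time limits set by the R_s/C low-pass filter on the drive signal."""
    tau = (z_l + r_s) * c_total
    min_bit_time = 2.3 * tau   # first expression of equation 14
    delay_50 = 1.0 * tau       # second expression of equation 14
    return min_bit_time, delay_50


# Hypothetical case: R_s = 100 ohm, C_in + C_par = 10 pF.
bit_time, delay = drive_filter(r_s=100.0, c_total=10e-12)
# Sensitivity of the delay to a 1-pF change in capacitance.
_, delay_plus = drive_filter(r_s=100.0, c_total=11e-12)
print(f"min bit time = {bit_time * 1e9:.2f} ns, delay = {delay * 1e9:.2f} ns")
print(f"delay shift for +1 pF = {(delay_plus - delay) * 1e12:.0f} ps")
```

Because the delay scales linearly with capacitance, a 1-pF uncertainty on a 10-pF load translates into a 10% uncertainty in the delay, whatever the value of R_s.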



Table IV shows values for the maximum data rate obtain- 
able and the corresponding delay for the 50% point. Also 
shown is the obtainable accuracy assuming a variation of 
1 pF for the capacitance. 

Table IV 

Low-Pass Filter Effects on Drive Signal 

(Driver transition time = 2 ns. C ln + C par = 10 pF.) 



R S ID) 
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Delay at 50% 


Delta Delay at 1 pF 
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1.5ns 


1.3 ns 
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2.5 ns 


2.2 ns 


225 ps 
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5.5 ns 


5.5 ns 
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11.5 ns 


1150 ps 




Fig. 13. Transformation of a desired load to resistive divider
parameters.

CMOS Device Measurement Results

HCMOS Example

Fig. 1 shows the signal obtained at the HP 82000 tester
comparator input from an HCMOS output. Switching characteristics
(at V_dd = 4.5V, T_a = 25°C, load capacitance C_l = 50 pF) are
transition time ≤ 8 ns, propagation delay ≤ 38 ns.

For comparison, Fig. 2 shows the signal obtained with the
same output connected to an open-ended transmission line.
Significant influences are introduced by the transmission line
environment.

Fig. 1. Shmoo plot of comparator input signal from an
HCMOS output buffer with 4-mA source/sink capability,
loaded by a resistive divider with parameters R_s = 200 ohms,
V_t = 2.2V.

Fig. 2. Shmoo plot of comparator input signal from an
HCMOS output buffer with 4-mA source/sink capability,
loaded by an open-ended transmission line (t_pd = 3 ns) with
the comparator in high-Z mode (lumped capacitance = 23 pF).



CMOS 14000 Family Example

Fig. 3 shows the signal obtained at the comparator input from
a CMOS 14000 family output. Switching characteristics (at
V_dd = 5V, T_a = 25°C, C_l = 50 pF) are transition time ≈ 33
ns + 1.35 ns/pF, propagation delay ≈ 80 ns + 0.9 ns/pF. To
show the comparator's sensitivity, the waveform is not
back-calculated according to equation 7 of the accompanying article
(that is, V_comp is shown, not V_o). Such a calculation would result
in values of 3.97V for the high level and 1.14V for the low level.

For comparison, Fig. 4 shows the signal obtained with the
same output connected to an open-ended transmission line.
Because of the slow transitions, the transmission line acts as
capacitive loading.

Fig. 3. CMOS MC14000 family output signal (V_oh = 2.5V at
I_oh = 2.1 mA, V_ol = 0.4V at I_ol = 0.44 mA) with resistive
divider parameters R_s = 1000Ω, V_t = 2.5V.

Fig. 4. CMOS MC14000 family output signal (V_oh = 2.5V at
I_oh = 2.1 mA, V_ol = 0.4V at I_ol = 0.44 mA) with open-ended
transmission line (t_pd = 3 ns), comparator in high-Z mode
(lumped capacitance = 23 pF, resulting in a load of 50 pF
total).

Fig. 14. Loading resulting from modifying V_t.

Fig. 15. Comparator overdrive as a function of data rate.

DC Accuracy with Resistive Divider 

For ease of use, the tester's software takes care of the
appropriate calculations of the user's comparator thresholds.
This is done using equation 7.

Fig. 16. Device under test power dissipation as a function
of frequency for capacitive loading at 50 pF and for a resistive
divider with 10-mW dc loading + 10-pF capacitance.

Thinking in terms of the noncompressed thresholds will
affect the dc accuracy. There are four sources of error in
setting the desired comparison threshold:

■ Termination source error: dV_t (mV)

■ Comparator threshold error: dV_th (mV)

■ Tolerance on R_s: dR_s (%)

■ Tolerance on R_t: dR_t (%).

The total accuracy for the desired threshold (dV_p) can
be calculated as:

dV_p = r dV_th + (r-1) dV_t + (1 - 1/r)(V_o - V_t)(dR_s - dR_t)        (15)

where r is the divide factor: r = (R_s + R_t)/R_t.

Using 1% resistors and assuming 10-mV basic accuracy
for the threshold and termination voltages, an accuracy
dV_p ≤ 2r(10 mV) can be obtained.
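The error budget of equation 15 can be evaluated numerically as sketched below. Resistor tolerances are entered as signed fractions (0.01 for +1%), so the worst case is obtained with opposite signs on R_s and R_t. The function name and all example values, including the 0.3-V difference between V_o and V_t, are assumptions for illustration.

```python
def threshold_error(r_s, r_t, dv_th, dv_t, v_o, v_t, dr_s, dr_t):
    """Error of the noncompressed threshold, following the form of equation 15."""
    r = (r_s + r_t) / r_t  # divide factor
    return (r * dv_th
            + (r - 1.0) * dv_t
            + (1.0 - 1.0 / r) * (v_o - v_t) * (dr_s - dr_t))


# Hypothetical setup: R_s = 200 ohm, R_t = 50 ohm (r = 5), 10-mV source errors,
# 1% resistors with opposite-sign deviations, threshold 0.3 V above V_t.
err = threshold_error(r_s=200.0, r_t=50.0, dv_th=0.010, dv_t=0.010,
                      v_o=2.8, v_t=2.5, dr_s=0.01, dr_t=-0.01)
print(f"threshold error = {err * 1e3:.1f} mV")
```

The two voltage terms alone contribute (2r - 1) times 10 mV; the resistor term grows with the distance between the threshold and V_t, so thresholds far from V_t benefit most from tighter resistor tolerances.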

DC Measurement Capability 

When the loading on the DUT pin matches the dc specifications,
further fanout measurements are not necessary,
but can be made anyway. The presence of R_s will cause a
voltage drop when a load current I_l is forced at the output
(Fig. 17). It is necessary to consider the drop when programming
the compliance voltage of the tester's parametric measurement
units (PMU). Since R_s and the forced current are
known, the actual output level can easily be calculated
with sufficient accuracy. It is:

For best results, 0.1% resistors are recommended for R_s.

Fig. 17. Dc measurement path using the resistive divider.
Summary

In the HP 82000 IC Evaluation System, the resistive divider
method offers advantages in operating speed and
measurement accuracy. The method has its restrictions and
does not ensure testability of every DUT.

Standards, Wayne C. Goeke, Ronald L. Swerlein, Stephen B.
Venzke, and Scott D. Stever

Josephson Junction Arrays

A High-Stability Voltage Reference

Design for High Throughput in a System Digital Multimeter, Gary
A. Ceely and David J. Rustici
Firmware Development System 
Custom UART Design 

High-Resolution Digitizing Techniques with an Integrating Digital 

Multimeter, David A. Czenkusch 
Time Interpolation 

Measurement of Capacitor Dissipation Factor Using Digitizing 
A Structural Approach to Software Defect Analysis. Takeshi 

Nakajo. Katsuhiko Sasabuchi, and Tadushi Akiyama 
Dissecting Software Failures, Robert B. Grady 
Defect Origins and Types 

Software Defect Prevention Using McCabe's Complexity Metric, 
William T. Ward 

The Cyclomatic Complexity Metric 

Object-Oriented Unit Testing, Steven P. Fiedler 

Validation and Further Application of Software Reliability 
Growth Models. Gregory A. Kruger 

Comparing Structured and Unstructured Methodologies in Firm- 
ware Development. William A. Fischer, /r. and /ames W. /osl 

An Object-Oriented Methodology (or Systems Analysis and Speci- 
fication. Barry D. Kurtz, Donna Ho, and Teresa A. Wall 

VXIbus: A New Interconnection Standard for Modular Instruments,
Kenneth Jessen

VXIbus Product Development Tools, Kenneth Jessen

June 1989

A Data Base (or Real-Time Applications and Environments. F'eyzi 
Fatehi. Cynthia Givens. tie T. lions. filfohaeJ H. Light. Ching- 
Chao Liu. and Michael / Wright 

New Miilrange Members o( the Hewlett-Packard Precision Ar- 
chitecture Computer Family, Thomas O. Meyer, Russell C. 
Brockmann. Jeffrey C. J forgis, John Keller, and Floyd E. Moore 

Double-Sided Surface Mount Process 

Data Compression in a Hall-Inch Reel-to-Reel Tape Drive. Mark 
I Bianchi. /e/fery /. Kato. and David /. Von Maren 

Maximizing Tape Capacity by Super-Blocking. David /. Van 
Maren, Mark /. Bianchi. and /e/fery /. Kalo 

High-Speed Lightwave Component Analysis. Roger IV. Wong, 
Paul Hernduy, Michael C. Hart, and Ceraldine A. Conrad 

OTDR versus OFDR 

Design and Operation Of High-Frequency Lightwave Sources and 
Receivers, Robert D. A/bin. Kent W. Leyde, Rollin F, Raivson. 
und Kenneth W. Shaughnessy 

High-Speed PIN Infrared Photodetectors for HP Lightwave Receivers 

Videoscope: A Noniritrusive Test Tool for Personal Computers. 
Myron R. Tuttle and Danny Low 

Videoscope Signature Analyzer Operation 

Neural Data Structures: Programming with Neurons, /. Barry 
ShackJeford 

A New 2D Simulation Model of Electromigration. Paul /. Marcoux. 
Paul P. Merchant. Vladimir Naroditsky. and Wulf I). Rehder 

August 1989

An Overview of the HP NewWave Environment, Ian /. Fuller 
An Object-Based User Interface for the HP NewWave Environ- 
ment, Peter S. Showman 
The NewWave Object Management Facility, /ohn A. Dysart 
The NewWave Office. Beatrice Lam, Scolt A. Hanson, and 
Anthony /. Day 

Agents and the HP NewWave Application Program Interface, 

Glenn H. Stearns 
Al Principles in the Design of the NewWave Agent and API 
An Extensible Agent Task Language. Barbara B. Packard and 

Charles H. Whelan 
A NewWave Task Language Example 

The HP NewWave Environment Help Facility, Vicky Spilman 
und Eugene /. Wong 

NewWave Computer-Based Training Development Facility. Law- 
rence A. Lynch-Freshner. R. Thomas Watson. Brian B. Egnn. and 
/ohn /. lencek 

Encapsulation of Applications in the NewWave Environment. 

Willinm M. Crow 
Mechanical Design of a New Quarter-Inch Cartridge Tape Drive. 

Andrew D. Topham 
Reliability Assessment of a Quarter-Inch Cartridge Tape Drive. 

Dnvid Gills 

Use of Structured Methods for Real-Time Peripheral Firmware. 
Paul F. Bartlelt. Paul F. Robinson, Tracey A. Hains, and Mark /. 
Simms 

Product Development Using Object-Oriented Software Technol- 
ogy, Thomas F. Kraemer 
Objective-C Coding Example 
Object-Oriented Life Cycles 

October 1989 

40 Years of Chronicling Technical Achievement. Charles L. Leuth 
A Modular Family of High-Performance Signal Generators. 

Michael D. Mc.N'amee and David L. Piatt 
Firmware Development for Modular Instrumentation. Kerwin D. 

Konago. Mark A. Stambaugh. and Brian D. Watkins 



RF Signal Generator Single-Loop Frequency Synthesis. Phase 
Noise Reduction, and Frequency Modulation. Brud E. Andersen 
and Earl C. Herleikson 

Fractional-N Synthesis Module 

Delay Line Discriminators and Frequency-Locked Loops 

Design Considerations in a Fast Hopping Voltage-Controlled Oscil- 
lator. Burton L. Mc/unkin and David M. Hoover 

High-Spectral-Purity Frequency Synthesis in a Microwave Signal 
Generator, James B. Summers and Douglas R. Snook 

Microwave Signal Generator Output System Design, Steve R. 
Fried. Keith L. Fries, and /ohn M. Sims 

"Packageless" Microcircuits 

Design of a High-Performance Pulse Modulation System, Douglas 
R. Snook and G. Stephen Curtis 

Reducing Radiated Emissions in the Performance Signal Genera- 
tor Family. Lorry R. Wright and Donald T. Borowski 

Processing and Passivation Techniques for Fabrication of High- 
Speed InP'InCaAs/lnP Mesa Photodetectors. Susan R. Sloan 

Providing Programmers with a Driver Debug Technique, Eve M. 
Tanner 

HP-UX Object Module Structure 
Identifying Useful HP-UX Debug Records 

Solder |oint Inspection Using Laser Doppler Vibrometry. 

Catherine A. Keely 
Laser Doppler Vibrometry 

A Model for HP-UX Shared Libraries Using Shared Memory' on 
HP Precision Architecture Computers, Anaslasio M. Martelli 

User-Centered Application Definition: A Methodology and Case 
Study, Lucy M. Berlin 

Interviewing Techniques 

Storyboarding Techniques 

Partially Reflective Light Guides for Optoelectronics Applica- 
tions. Carolyn F. /ones 

December 1989 

System Design for Compatibility of a High-Performance Graphics
Library and the X Window System, Kenneth H. Bronstein, David
J. Sweetser, and William R. Yoder
The Starbase Graphics Package
The X Window System

Managing and Sharing Display Objects in the Starbase/X11 Merge
System, James R. Andreas, Robert C. Cline, and Courtney
Loomis

Sharing Access to Display Resources in the Starbase/X11 Merge
System, Jeff R. Boyton, Sankar L. Chakrabarti, Steven P. Hiebert,
John J. Lang, Jens H. Owen, Keith A. Marchington, Peter R.
Robinson, Michael H. Stroyan, and John A. Waitz

Sharing Overlay and Image Planes in the Starbase/X11 Merge System,
Steven P. Hiebert, John J. Lang, and Keith A. Marchington

Sharing Input Devices in the Starbase/X11 Merge System, Ian A.
Elliot and George M. Sachs
X Input Protocol and X Input Extensions

Sharing Testing Responsibilities in the Starbase/X11 Merge System,
John M. Brown and Thomas J. Gilg

A Compiled Source Access System Using CD-ROM and Personal
Computers, B. David Cathell, Michael B. Kalstein, and Stephen
J. Pearce

Transmission Line Effects in Testing High-Speed Devices with a
High-Performance Test System, Rainer Plitschka
CMOS Device Measurement Results

Custom VLSI in the 3D Graphics Pipeline, Larry J. Thayer
Global Illumination Modeling Using Radiosity, David A. Burgoon

PART 2: Subject Index



Subject Page Month 

A 

Ac voltage measurements, 
digital tS'Apr. 

Adaptive subdivision 86 Dec 

ADC. 16-to-28-bit 8/Apr. 

Agent 32 Aug. 

Agile signal generator 14/Oct. 

Air jet. lead inspection 81 /Oct. 

AtC loop 34.48.49/Oct. 

Algorithm, data compression 26/June 

Algorithm, electromigration 

simulation BO/June 

Algorithm, hemicube 81/Dec. 

Algorithm, mullislope runup 10/Apr. 

Algorithm, routing 48/Feb. 

Algorithm, subsampled ac 17/Apr. 

Algorithm, substructuring 86/Dec. 

Amplifier, GaAs 41/Oct. 

Amplifier, power 34.48/Oct. 

Amplitude modulation 59/Feb. 

Analyzer, frequency and lime 
interval 6/F'eb. 

Analyzer, lightwave component . 35/June 

Animation object 54/Aug. 

Anniversary, 40 years 6/Oct. 

Antenna, lunnd dipole 62/Oct. 

Anti-aliasing filters 67/F'eb. 

Aperture, ADC 14,41/Apr. 

Application definition 90/Oct. 

Application program interface 
(API] 34/Aug. 

Application-specific encapsula- 
tion 63/Aug, 

Architecture, voice and data 

network 43/Feb. 

Arming 9/Feb. 

Audit testing, tape drive 77/Aug. 

B 

Backing store 30/Dec. 

Bandwidth measurements, laser . 41/|une 

Behavior specifications 88/Apr. 

Blocking, tape drive 32/June 

c 

Cabinet RFI design 6U/Ocl. 

Calibration, electrooptical 40.45/)une 

Calibration firmware 24/Oct. 

Calibration, two-source 22/Apr. 

Capacitor dissipation factor 46/Apr. 

Capstan motor 75/Aug. 

CBT display object 52/Aug. 

CBT sample lesson 49/Aug. 

CD-ROM. source i:ode 50/Dec. 

Class 70/Apr.,91/Aug. 

Clip list 11,23/Dec. 

CMOS IC testing 61/Dec. 

Codewords, data compression .... 26/June 

Color map 11/Dec.

Color map type 35/Dec. 



Combined mode 11. 34 Dec. 

Combined mode clipping 37/Dec. 

Compaction, tape 26.33/June 

Comparator hybrid 26'Teb. 

Complexity metric 64.66.85'Apr. 

Compound data objects 13/Aug. 

Computer, midrange HP Precision 

Architecture 18/June 

Computer-based training 48/Aug. 

Concept diagram 88/Apr. 

Container objects 13.24/Aug. 

Context diagrams BO/Aug. 

Context switching 57,'Aug. 

Continuous measurement 

technique 7/Feb. 

Controller, floating-point 21/June 

Concurrency 16/June. 96/Aug. 

Converter. A-to-D. 16-lo-28-bit 8 'Apr. 

Coprocessor, floating-point 21/June 

Core alignment 72/Aug. 

Core input devices 39/Dec. 

Crack growth, thin-film 82/June 

Create process 26/Aug. 

Current flow simulation 81/June 

Cyclomatic complexity 

metric 64.66.85/Apr. 

D 

Dark current 69/Oct. 

Data base backup 16/|une 

Data base data structures 9/|une 

Data base performance 15/|une 

Data base schema 15/June 

Data base tables 9/June 

Data compression, tape drive 26/)une 

Data flow diagrams 80/Aug. 

Data link layer 45/Feb. 

Data pointer 87/Oct. 

Data structures, neural 69/|une 

Dc measurements, calibration 24/Apr. 

Dc offset hybrid 25/Feb. 

Debug technique, driver 76/Oct. 

Decompression, data 28/|une 

Definition, application 90/Oct. 

Delay line 30.35/Oct. 

Deviations (frequency, time. 

phase) 30/Feb. 

DFT test, ADC 40/Apr. 

Diagnostic firmware 25/Oct, 

Dictionary, data compression 26/|une 

Dielectric passivation 72/Oct. 

Dielectrics, reflectivity 99/Oct. 

Differential linearity, ADC 22/Apr. 

Digital signature analysis 62/June 

Digital synthesis 53/Feb. 

Digital waveform synthesizer 

IC 53. 5 7/Feb. 

Digitized FM 32/Oct. 

Digitizing, multimeter 39/Apr. 



Direct hardware access (DHA 111 .22 Dec. 

Discriminator, delay line 30.32'Oct. 

Dissipation factor measure- 
ments 46/Apr. 

Dithering 77,'Dec. 

Divided output section 42/Oct. 

Divider. GaAs 40/Oct. 

Doppler vibrometry. laser 82/Oct. 

DOS programs service 58'Aug. 

Double-sided surface mount 

process 23/June 

Drawable 11/Dec. 

Driver debugging 76/Oct. 

Dual-slope ADC 8/Apr. 

DVVSIC 53.57/Feb. 

Dynamic range, lightwave 

measurements 50/June 



Effective bits 39/Apr. 

Eight queens problem 73/June 

Electrical-to-optical device 

measurements 36/June 

Electromigration simulation 

model 79/June 

Electrophotography, erase bar 98'Oct. 

EMI. signal generator 59/Oct. 

Encapsulation 57.89/Aug. 

Equilibrium, neural network 71/June 

Erase bar, LED 98/Oct. 

Errors, digital ac 18/Apr. 

Errors, ratio measurements 22/Apr. 

Extensible task language 35.38/Aug. 



Faceless instruments 94/Aug. 

Factor, super-blocking 

advantage 34/June 

Failure, thin metal lines 82/|une 

FET models 56/Oct. 

Fiber optic component analysis .. 35/)une 

File locking and concurrency 16/June 

Filler, dielectric 99/Oct. 

Fillers. CD-ROM 53/Dec. 

Firmware design 13/Feb. 

Firmware design, multimeter 31/Apr. 

Firmware design, synthesizer 70/Feb. 

Firmware, signal generator 20/Oct. 

Floating output amplifier 69/Feb. 

Floating-point coprocessor 21/June 

Flow control 46/Feb. 

FOCUS command 41/Aug. 

Form factor, illumination 80/Dec. 

Forty years of HP Journal 6/Oct. 

Four-color map problem 74/June 

Fractional-N frequency 

synthesis 18.28/Oct. 

Frame buffer 11,21/Dec. 

Frame engine 44/Feb.

Frequency agile signals 31,35/Feb.

Frequency agile signal generator . 14/Oct.

Frequency analyzer 6/Feb. 

Frequency estimation 17.30/Feb. 

Frequency-locked loop 27,30/Oct. 

Frequency modulation 29,32,38/Oct. 

Frequency reference 68/Feb. 

Frequency response calibration .. 27/Apr. 

Frequency synthesis 27,37/Oct. 

Fresnel reflection 98/Oct. 

FURPS 83/Apr. 

G 

GaAs ICs 41/Oct. 

Gain calibration, ac 28/Apr. 

Gain errors 24/Apr. 

Gate arrays 32/Apr. 

Gating 9/Feb. 

Generic encapsulation 58/Aug. 

Global illumination modeling 78/Dec. 

Global inhibition 71/June 

Graded-index lens 54/June 

Grain structure 80/|une 

Graph sectioning problem 77/)une 

Graphics accelerator 20,74,87/Dec. 

Graphics context 11,24/Dec. 

Graphics, illumination modeling . 78/Dec. 
Graphics resource manager 

(GRM) 11,12/Dec. 

Graphics subsystem, VLSI 74/Dec. 

GRM daemon 16/Dec. 

Group-V passivation 71/Oct. 

H 

Hash indexes 12/June 

H-bridge 53/June 

Help facility 43/Aug. 

Help screen structure 21/Feb. 

Hemicube algorithm 81/Dec. 

Heterodyne output section 44/Oct. 

Hierarchical block design, HBD ... 63/Feb. 

High-resolution digitizing 39/Apr. 

High Sierra standard 52/Dec. 

High-speed IC testing 58/Dec. 

History, HP Journal 6/Oct. 

Holdoff 10/Feb. 

Hopfield neuron 69/June 

Hopping signal generator 14/Oct. 

Hop RAM 59/Feb. 

HP-H1L and testing 44/Dec. 

HP-HIL input devices 39/Dec. 

HP Journal, 40 years B/Oct. 

HP-UX driver debugging 76/Oct. 

HP-UX semaphores 16/June,26/Dec. 

HP-UX shared libraries 86/Oct. 

Hysteresis 26/Feb. 

I 

1C testing, transmission 

line effects 58/Dec. 

Illumination modeling 78/Dec. 

Image planes 11,33/Dec. 

Inguard section 31 /Apr. 



Inhibition 71/June 

InP/InGaAs/InP diodes 69/Ocl. 

Input amplifier 24/Feb. 

Input areas 13/June 

Instantaneous frequency 9/Feb. 

Integral linearity, ADC 14,22/Apr. 

Interpolation, time 40/Feb.,42/Apr. 

Interview techniques 92/Oct. 

J 

Joints, solder, surface mount 81/Oct. 

Josephson junction arrays 24/Apr. 

Journal. HP 6/Oct. 

K 

Keyboard/HP-HII. 

emulator 64/June,44/Dec. 

Keyword scanner 21/Oct. 

Kink, laser output 53/June 

L 

Laser Doppler vibrometrv 82/Oct. 

Laser measurements 41/June 

Lateral inhibition 71/June 

Launch, optical 53/June 

Leads, surface mount, unsoldered . 81/Oct. 

LED erase bar 98/Oct. 

Level accuracy 50/Oct. 

Light guides 98/Oct. 

Light pipes 100/Oct. 

Lightwave component analysis ... 35/June 
Lightwave sources and receivers . 52/|une 

Linear FM 32/Oct. 

Linearity, ADC 14,22/Apr. 

Links, trunk and access 44/Feb. 

Localizability 47/Aug. 

Locking strategy 25/Dec. 

M 

Mastering 54/Dec. 

Masters 28/Aug. 

McCabe's complexity metric 

64,66,85/Apr. 

Measurement objects 97/Aug. 

Memory board. 16M-byte 25/June 

Merge program 77/Oct. 

Merge system, Starbase/Xll 6/Dec. 

Messages and methods 19,89/Aug. 

Microwave extender output section 

49/Oct. 

Microwave signal generators 14/Oct. 

Millimeter-wave analysis 8/Feb. 

Mixer/detector 8/Feb. 

Model, electromigration 79/June 

Models, FET 56/Oct. 

Models, termination 59/Dec. 

Modular instrument systems 91/Apr. 

Modular signal generators 14/Oct. 

Modulation transfer function, 

lightwave 36,41/June 

Modulator, pulse 54/Oct. 



MOMA (multiple, obscurable, 

movable, and accelerated 

windows 11, 25/Dec. 

MPE source access system 50/Dec. 

MS-DOS objects 28/Aug. 

Multifunction synthesizer 52/Feb. 

Multimeter, BVi-digit 6/Apr. 

Multislope rundown 9/Apr. 

Multislope runup 10/Apr. 

N 

Network, voice and data 42/Feb. 

Neural data structures 69/June 

Neuron programming 69/June 

NewWave agent 32/Aug. 

NewWave application program 

interface (API) 32/Aug. 

NewWave computer-based training 
(CBT) 48/Aug. 

NewWave encapsulation 57/Aug. 

NewWave environment, 
overview 6/Aug. 

NewWave help facility 43/Aug. 

NewWave object management 
facility (OMF) 17/Aug. 

NewWave Office 23/Aug. 

NewWave windows 23/Aug. 

N-flops 70/June 

NMOS-III chip 62/Feb. 

Noise, ADC 13/Apr. 

Noise floor, optical measure- 
ments 49/June 

Noise, signal generator 27/Oct. 

Numeric data parser 70/Feb. 

Nusselt analog 82/Dec. 

o 

Ohject-based user interface 9/Aug. 

Object class 18,91/Aug. 

Object encapsulation B9/Aug. 

Object life cycle 19/Aug. 

Object links 10,18/Aug. 

Object management facility 17/Aug. 

Object model 11/Aug. 

Object models and views 94/Aug. 

Object module, HP-UX 78/Oct. 

Object-oriented 69,86/Apr. 

Object-oriented language 93/Aug. 

Object-oriented life cycle 98/Aug. 

Object-oriented technology 87/Aug. 

Object properties 18/Aug. 

Object-relationship diagrams 87/Apr. 

Object ive-C 95/Aug. 

Objects 70,86/Apr. 

Objects, graphic 13/Dec. 

Objects, NewWave 9.17/Aug. 

Office metaphor 12/Aug. 

Office, NewWave 23/Aug. 

Offscreen memory ll,15>Dec. 

Offset errors 23/Apr. 

Ohms calibration 25/Apr. 

On-the-fly counter readings 33/Feb. 

Optical device measurements 42/June 

Optical frequency-domain
reflectometry 43/June

Optical reflection measurements 42/June

Optical time-domain
reflectometry 43/June

Optical-to-electrical device
measurements 36/June

Optoelectronic erase bar 98/Oct.

Oscillator, fast hopping 34/Oct.

Oscillator, YIG-tuned 39/Oct.

Outguard section 31/Apr.

Output system, signal generator 42/Oct.

Overlay planes 11,33/Dec.

Oxide passivation 70/Oct.



"Packageless" microcircuits 44/Oct. 

Packets 43/Feb. 

Parser, command 22/Oct. 

Partially reflective light guides .... 98/Oct. 

Passivation, photodetectors 69/Oct. 

PC/CD-ROM source access system . 50/Dec. 

P-code 39/Aug. 

Peak detector 48/Oct. 

Performance signal generators 14/Oct. 

Phase digitizing 28/Feb. 

Phase-locked binary reference 

frequency 68/Feb. 

Phase-locked loop 27,45/Oct. 

Phase noise 27,39/Oct.

Phase progression plot 30/Feb.

Phase modulation 59/Feb.

Photodetectors, pin, high-speed 56/June

Photodetector processing 69/Oct.

Photodiode measurements 42/June

Pin photodetectors 56/June

Pipeline, graphics 74/Dec. 

Pixel cache 76/Dec. 

Pixel processor 77/Dec. 

Pixel value 11/Dec.

Pixmap 11/Dec. 

Platform definition 90/Oct. 

Pointers, updating 79/Oct. 

Polymorphism 90/Aug. 

Port/HP-UX (PORT/RX) 86/Oct. 

Power compression measurements,
laser 41/June

Precision Architecture computer,
midrange 18/June

Precision Architecture, HP-UX
shared libraries 86/Oct.

Premastering 54/Dec. 

Processor board, midrange 

computer 19/June 

Program faults 51/Apr. 

Programming with neurons 69,72/June

Progressive refinement 86/Dec. 

Pulse modulation system 51/Oct. 

Pulse modulator IC 56/Oct. 

Q 

Quarter-inch cartridge tape drive . 67/Aug. 
Query/debug 17/June



Radiosity 79/Dec. 

Ray tracing 78/Dec. 



Reading storage, multimeter 37/Apr.

Receivers, lightwave 52/June

Real-time data base 6/June

Real-time firmware 79/Aug.

Recognizing code quality 65/Apr. 

Reference frequency 68/Feb. 

Reference voltage 28/Apr. 

Reflection in light guides 98/Oct. 

Reflection measurements,
optical 42/June

Reflection sensitivity measurements,
laser 41/June

Reflectivity, dielectric 98/Oct.

Refractive index 98/Oct.

Reliability, tape drive 74/Aug.

Reliability, IC 79/June

Reliability, software 75/Apr.

Rendering 11/Dec.

Resistive divider, IC testing 62/Dec.

Resolution, ADC 13,39/Apr.

Responsivity, electrooptical
device 40/June

Result objects 99/Aug. 

Return loss measurements,
optical 44/June 

Reusability 83/Apr. 

Reverse power protection 50/Oct. 

RF signal generator 14/Oct. 

RFI, signal generator 59/Oct.

Routing, network 47/Feb. 



Sampling 9/Feb. 

Sampling, equivalent time 16/Apr. 

SA/SD and design process 54/Apr. 

Scan conversion 75/Dec. 

Scan paths 64/Feb. 

Semaphores 16/June, 17/Dec.

Sequencer IC 38/Feb.

Shared libraries, HP-UX 86/Oct.

Shared memory 86/Oct., 11,12/Dec.

Sharing cursors 27/Dec.

Sharing fonts 27/Dec.

Sharing objects 16/Aug., 14/Dec.

Sharing the color map 28/Dec.

Signal generators 14/Oct.

Signal handling, shared libraries 88/Oct.

Signature analysis 62/June

Simulation, electromigration 79/June

Single-loop frequency synthesis 16,39/Oct.

Slope responsivity 40/June

Slot 0 Module 93,96/Apr.

Snapshots 21/Aug. 

Software defect analysis 50/Apr. 

Software defect causes 59/Apr. 

Software defect data collection ... 57/Apr. 

Software defect perspectives 57/Apr. 

Software defect prevention 64/Apr. 

Software defect data validation .. 58/Apr. 

Software defect types 62/Apr. 

Software failure rate 75/Apr. 

Software process improvement ... 65/Apr. 

Software productivity 81/Apr. 

Software release goals 77/Apr. 

Software reliability 75/Apr. 

Software test tool 58/June



Solder joint inspection 81/Oct.

Source code access system 50/Dec.

Source code, lack of 76/Oct.

Sources, lightwave 52/June

Spectra, lead vibration 83/Oct.

SPUs, HP Precision Architecture 18/June

SRX graphics subsystem 74/Dec.

Stacked screens mode 34/Dec.

Starbase 7,37/Dec.

State net 99/Apr.

State transition diagram 80/Aug.

Storyboard techniques 95/Oct.

Strip file 77/Oct.

Strip program 77/Oct.

Structured testing 83/Aug.

Structured analysis and
structured design 54,80/Apr.

Structured methods 79/Aug.

Subsampling, synchronous 16/Apr.

Substructuring 84/Dec.

Super-blocking 32/June

Surface mount leads, unsoldered 81/Oct.

Surface mount process,
double-sided 23/June

Switching engine 44/Feb.

Symbolic debug, driver 76/Oct.

Synthesized signal generators 14/Oct.

System analysis 86/Apr.



Tape cartridge mechanics 69/Aug. 

Tape drive, 1/4-inch 67/Aug.

Tape drive, data compression 26/June

Tape head wear 74/Aug.

Task automation 34/Aug.

Task language, agent 35,38/Aug.

Task language parser 40/Aug.

Tear/build engine 44/Feb.

Temperature distribution,
thin-film 81/June

Termination models, IC test 59/Dec.

Test plan 72/Apr.

Test process 71/Apr.

Test script 58/June

Testing, Starbase/X11 Merge 42/Dec.

Thermal control, laser 52/June

Throughput, multimeter 31/Apr.

Time interval analyzer 6/Feb. 

Time to failure, thin metal 

lines 82/June 

Time variation display 11/Feb. 

Tokens 21/Oct. 

Track density 70/Aug. 

Track-and-hold circuit 19/Apr. 

Track seeking 72/Aug. 

Transform engine 75/Dec. 

Transform, time-domain 38/June

Transmission line effects,
IC testing 58/Dec.

Transparency 75,77/Dec.

Traveling salesman problem 75/June

Trigger circuit 24/Feb.

Transparent color 37/Dec.

Troubleshooting, HP 3000 50/Dec.

Tuned dipole antenna 62/Oct.

Tuples 7/June

Turbo SRX graphics
subsystem 12,74/Dec.

U

UART, custom 36/Apr.

Unit testing 69/Apr.

Unsoldered leads, surface mount 81/Oct.

User-centered application
definition 90/Oct.

V

Vectored interrupts 73/Feb.

VCO, fast hopping 34/Oct.

VCO, YIG-tuned 39/Oct.

Vibration spectrum, SMT leads 83/Oct.

Vibrometry, laser 82/Oct.

Video feedthrough 58/Oct.

Videoscope 58/June

Video signature analyzer 62/June 



Views 20/Aug. 

Virtual circuits 43/Feb. 

Virtual instruments 96/Aug. 

VISTA 93/Aug. 

Visual type 12,35/Dec.

VLSI, graphics 74/Dec.

Voice and data network 42/Feb.

Void formation, electromigration 81/June

Voltage reference, high-stability 28/Apr.

VMEbus 91/Apr.

Vscope 59/June

VXIbus 91/Apr. 

VXIbus development tools 96/Apr. 

W 

Waveform analysis library 47/Apr. 

Wave impedance 64/Oct. 



Windows, NewWave 23/Aug. 

WYSIWYG 10/Aug. 

X 

X driver interface (XDI) 9,12/Dec.

X11 8/Dec.

X server 6,12/Dec.

X Window System 8/Dec.

Y

YIG-tuned oscillator 39/Oct.

Z

Z-buffer 75,76/Dec. 

Z-cache 76/Dec. 

Zero-dead-time counters 16,33/Feb. 



PART 3: Product Index 



HP E1400A VXIbus Mainframe Apr. 

HP E1404A VXIbus Slot 0 Module Apr.

HP E1490A VXIbus Breadboard Module Apr. 

HP E1495A VXIbus Development Software Apr. 

HP 3000 Series 935 Computer June 

HP 3458A Multimeter Apr. 

HP 5364A Microwave Mixer/Detector Feb. 

HP 5371A Frequency and Time Interval Analyzer Feb. 

HP 7980XC Tape Drive June 

HP 8644A Synthesized Signal Generator Oct. 

HP 8645A Agile Signal Generator Oct. 

HP 8665A Synthesized Signal Generator Oct. 

HP 8702A Lightwave Component Analyzer June

HP 8904A Multifunction Synthesizer Feb.

HP 9000 Model 835 Computer June

HP 9000 Series 300/800 Turbo SRX 3D Graphics Subsystem Dec.

HP 9145A 1/4-Inch Cartridge Tape Drive Aug.



HP 11889A RF Interface Kit June 

HP 11890A Lightwave Coupler June 

HP 11891A Lightwave Coupler June 

HP 82000 IC Evaluation System Dec.

HP 83400A Lightwave Source June 

HP 83401A Lightwave Source June 

HP 83402A Lightwave Source June 

HP 83403A Lightwave Source June 

HP 83410B Lightwave Receiver June

HP 83411A Lightwave Receiver June

HP 98646A VMEbus Interface Apr. 

HP NewWave Environment Aug. 

HP Real-Time Data Base June 

HP Starbase Graphics Library Dec. 

HP VISTA Aug. 

X Window System Version 11 Dec. 



PART 4: Author Index 



Akiyama, Tadashi Apr.

Albin, Robert D. June

Andersen, Brad E. Oct.

Andreas, James R. Dec.

Barnes, James O. Feb.

Bartlett, Paul F. Aug.

Berlin, Lucy M. Oct.

Beucler, Dale R. Feb.

Bianchi, Mark J. June

Borowski, Donald T. Feb., Oct.

Boyton, Jeff R. Dec.

Brockmann, Russell C. June

Bronstein, Kenneth H. Dec.

Brown, John M. Dec.

Burgoon, David A. Dec.

Cathell, B. David Dec.

Ceely, Gary A. Apr.

Chakrabarti, Sankar L. Dec.

Chu, David C. Feb.

Cline, Robert C. Dec.

Coackley, Robert Feb.

Conrad, Geraldine A. June

Crow, William M. Aug.

Curtis, G. Stephen Oct.

Czenkusch, David A. Apr.
Day, Anthony J. Aug.

Dysart, John A. Aug.

Egan, Brian B. Aug.

Elliot, Ian A. Dec.

Fatehi, Feyzi June

Fiedler, Steven P. Apr.

Fischer, William A., Jr. Apr.

Fletcher, Cathy Oct.

Fried, Steve R. Oct.

Fries, Keith L. Oct.

Fuller, Ian J. Aug.

Giem, John Apr.

Gilg, Thomas J. Dec.

Gills, David Aug.

Givens, Cynthia June

Goeke, Wayne C. Apr.

Grady, Robert B. Apr.

Hains, Tracey A. Aug.

Hanson, Scott A. Aug.

Hargis, Jeffrey G. June

Hart, Michael G. June

Heikes, Craig A. Feb.

Heinzl, Johann I. Feb.

Helmso, Bennie E. Oct.

Herleikson, Earl C. Oct.

Hernday, Paul June

Hiebert, Steven P. Dec.

Higgins, Thomas M., Jr. Feb.

Ho, Donna Apr.

Hong, Le T. June

Hoover, David M. Oct.

Ives, Fred H. Feb.

Jencek, John J. Aug.

Jessen, Kenneth Apr.

Jones, Carolyn F. Oct.

Jost, James W. Apr.

Kalstein, Michael B. Dec.

Kanago, Kerwin D. Oct.

Kato, Jeffery J. June

Keely, Catherine A. Oct.



Keller, John June

Kraemer, Thomas F. Aug.

Kruger, Gregory A. Apr.

Kurtz, Barry D. Apr.

Lam, Beatrice Aug.

Lang, John J. Dec.

Leath, Charles L. Oct.

Leyde, Kent W. June

Light, Michael R. June

Liu, Ching-Chao June

Loomis, Courtney Dec.

Low, Danny June

Lynch-Freshner, Lawrence A. Aug.

Marchington, Keith A. Dec.

Marcoux, Paul J. June

Martelli, Anastasia M. Oct.

McCabe, Thomas J. Apr.

McCormick, Alan L. Feb.

McJunkin, Barton L. Oct.

McNamee, Michael D. Oct.

Merchant, Paul P. June

Meyer, Thomas O. June

Moore, Floyd E. June

Nakajo, Takeshi Apr.

Naroditsky, Vladimir June

Nimori, Torrance K. Feb.

Owen, Jens R. Dec.

Packard, Barbara B. Aug.

Pearce, Stephen J. Dec.

Piatt, David L. Oct.

Plitschka, Rainer Dec.

Rawson, Rollin F. June

Rehder, Wulf D. June

Robinson, Paul F. Aug.

Robinson, Peter R. Dec.

Rustici, David J. Apr.

Sachs, George M. Dec.

Sasabuchi, Katsuhiko Apr.

Schneider, Richard Feb.



Schwartz, David J. Feb.

Shackleford, J. Barry June

Showman, Peter S. Aug.

Simms, Mark J. Aug.

Oct.

Sloan, Susan June, Oct.

Smith, David E. Apr.

Snook, Douglas R. Oct.

Aug.

Stambaugh, Lisa B. Feb.

Stambaugh, Mark A. Oct.

Steadman, Howard L. Feb.

Stephenson, Paul S. Feb.

Slrnvan KffifunAfu H Dec.

Summers, James B. Oct.

Dec.

Apr.

Feb.

Tanner, Eve M. Oct.

Dec.

Feb.

Aug.

Tuttle, Myron R. June

Apr.

Waltz, John A. Dec.

Wall, Teresa A.

Ward, William T.

Watkins, Brian D. Oct.

Watson, R. Thomas Aug.

Wechsler, Mark Feb.

Mftralnn Charles H

Wright, Larry R. Feb., Oct.

Wright, Michael J. June

Yoder, William R. Dec.
Custom VLSI in the 3D Graphics Pipeline



VLSI transform engine, z-cache, and pixel processor chips 
widen bottlenecks in the pipeline to allow the HP 9000 Series 
300 and 800 TurboSRX graphics subsystem to deliver 
enhanced performance compared to the earlier SRX 
design. 

by Larry J. Thayer 



PRODUCTS FOR DISPLAYING 3D GRAPHICS on engineering
workstations have been appearing at an ever-increasing
rate over the last few years. Products of each succeeding
generation are much more interactive and have significantly
more capabilities than earlier ones. Fueling the fast-paced
change are new algorithms, better architectures, and, most
important, advances in VLSI (very large-scale integration)
processing and design.

Within HP, perhaps the first use of a custom VLSI chip
for computer graphics applications was in a graphics
display for a desktop computer introduced in 1981. The chip
accelerated vector drawing on HP 9845B Computer displays.
Our first 3D product, the HP 98700A, introduced in 1985,
drew fast wireframe images with the aid of special
commercially available video RAM chips. These chips
allowed the raster display to be refreshed at the same time
the image was changing.

HP's first solids modeling graphics subsystem, the HP
9000 Series 300 and 800 SRX, was introduced in 1986. It
uses a proprietary HP process (NMOS-III) to build chips
for floating-point operations (essential for fast 3D graphics)
and for the scan conversion process (polygon and vector
drawing).¹ Another proprietary process (LTCMOS) is used
for a chip that caches pixels, thus allowing multiple pixels
to be changed per RAM cycle.² The upgrade system for the
SRX, the TurboSRX, introduced in 1988, uses even more
VLSI for increased performance and functionality.

Custom VLSI is the technology of choice for producing
interactive 3D graphics for several reasons:

■ VLSI devices are a capable source of the very high
computation rates needed for fast, interactive graphics. (The
scan converter chip used in both the SRX and the TurboSRX
is capable of performing over 300 million additions per
second.)

■ Data flow is pipelined, with each point in the pipeline
having a particular function. VLSI chips can be tailored
to each function.

■ The low-cost potential provided by large-scale integration
makes interactive 3D graphics capability available
in a workstation that an engineer can afford.

This article describes how the 3D graphics pipeline of 
the SRX was analyzed, and how custom VLSI was used in 
the next-generation product, the TurboSRX, to improve the 
overall graphics performance. 

Pipeline Stages 

Graphics workstations contain a data pipeline for
displaying user graphics data bases (see Fig. 1). The source
data is stored in the host system memory, typically in a
display list format. This list is simply a file containing a
hierarchical list of the graphics primitives needed to draw
the image. First in the pipeline is the system CPU, which
reads the display list and sends commands to the graphics
subsystem. Using the main system CPU for display list
processing minimizes system cost and allows the size of
the display list to be limited only by the virtual memory
space of the processor.

Next in the graphics pipeline is the transform engine
block, which resides in the graphics subsystem and consists
of one or more microcodable processors (called transform
engines). The transform engine block performs matrix
multiply calculations for positioning the image in
three-dimensional space, clips the image to the viewing window,
calculates polygon vertices for parametric surface commands,
and applies lighting calculations for realism.

Fig. 1. The 3D graphics pipeline in the HP 9000 Series 300
and 800 TurboSRX graphics subsystem.

When the transform engines have finished all necessary
calculations, they send the polygon and vector endpoints
(in integer device coordinates) to the scan converter. The
function of the scan converter is to draw the individual
polygons and vectors into the frame buffer where they can
be viewed. In the scan conversion process, each pixel in
the polygon is calculated individually to determine its x,
y, z, red, green, and blue values. The x and y values
determine the pixel's location on the screen, the color values
allow smooth shading of colors, and the z values are sent
to the z-buffer for hidden-surface removal.

After the pixels have been calculated, a dither circuit 
operates on the color values to provide a greater number 
of apparent colors, thus allowing true-color images with 
as few as eight graphics planes. (When 24 planes of frame 
buffer memory are available, dithering is not used.) Trans- 
parency is implemented by drawing alternate pixels of the 
transparent surface, a technique known as "screen door 
transparency." The technique gets its name from the screen- 
door-like pattern used to determine which pixels to draw. 
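The dithering and screen door transparency steps described above can be sketched in a few lines. This is only an illustration: the 4 x 4 ordered-dither matrix, the checkerboard mask, and the function names are assumptions made for the example, not the patterns or logic actually used in the SRX or TurboSRX hardware.

```python
# Illustrative sketch only. The 4x4 Bayer matrix and the checkerboard
# mask are assumed for the example; the article does not give the
# actual hardware patterns.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_channel(value, x, y, levels):
    """Quantize an 8-bit value (0..255) to `levels` output levels.

    The pixel position selects a threshold, so neighboring pixels round
    in different directions and the average over an area approximates
    the original value.
    """
    scaled = value * (levels - 1) / 255.0   # position on the coarse scale
    base = int(scaled)                      # lower of the two nearest levels
    frac = scaled - base                    # how far toward the next level
    threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16.0
    return min(base + (1 if frac > threshold else 0), levels - 1)


def screen_door_visible(x, y):
    """Checkerboard mask: draw only alternate pixels of a transparent surface."""
    return (x + y) % 2 == 0
```

With a one-bit output (`levels=2`), a mid-gray input of 128 turns on half of the pixels in each 4 x 4 cell, which is how a small number of planes can suggest many more apparent colors.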

Z-buffering is a general-purpose approach to hidden-sur- 
face removal. The z-buffer is simply RAM in which 16 bits 
are allocated for each pixel on the screen. It works by com- 
paring the z value (depth) of the pixel being drawn to that 
of the pixel already present at that location, if any. If the 
new pixel is closer, it is drawn to the frame buffer and the 
z value is updated to that of the pixel being drawn. If it is 
farther away, the pixel is not drawn and the z-buffer is not 
updated. 
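The z-buffer comparison described above amounts to one test per pixel. The following minimal sketch assumes that smaller z values are closer to the viewer and that the buffer is cleared to the largest 16-bit value; both conventions, and the function names, are assumptions for illustration.

```python
# Minimal sketch of the z-buffer test. Assumed conventions: smaller z
# means closer to the viewer, and an empty location holds Z_FAR.
Z_FAR = 0xFFFF  # largest 16-bit depth value


def make_buffers(width, height, background=0):
    """Create a frame buffer and a matching 16-bit z-buffer."""
    frame = [[background] * width for _ in range(height)]
    zbuf = [[Z_FAR] * width for _ in range(height)]
    return frame, zbuf


def plot(frame, zbuf, x, y, z, color):
    """Draw the pixel only if it is closer than what is already stored."""
    if z < zbuf[y][x]:
        frame[y][x] = color   # new pixel wins: update the color ...
        zbuf[y][x] = z        # ... and remember its depth
        return True
    return False              # farther away: leave both buffers alone
```

Because the test depends only on the depth already stored at each location, polygons can be drawn in any order and the nearest surface still ends up visible.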
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Comparative Performance 

Because the SRX was the first product of its caliber, there 
were many unknowns about how the product would be 
used and how it would perform. Since then, much has 
been learned from our customers and from our own 
analyses about what features are commonly used and what 
sizes of polygons are typically drawn. For the purpose of 
illustration, we will examine two kinds of polygons: small 
polygons (defined as being 20 x 20-pixel unconnected 
quadrilaterals) and large polygons (defined as being larger 
than 200x200 pixels). The performance metric for small 
polygons is polygons per second, and large polygons are 
measured in pixels drawn per second. 

Figs. 2 and 3 show the relative performance of different
stages in the pipeline. It is important to keep in mind that
since the graphics architecture is organized as a pipeline,
the performance of the system is determined by the slowest
block in the sequence. Note that for small polygons the
transform engine block limits the performance on the SRX,
with the z-buffer being the next limiter. For large polygons,
the z-buffer is the primary culprit, but the dither/transparency
circuit is right behind.
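The "slowest block" rule can be made concrete with a short sketch. The stage rates below are made-up relative numbers chosen only to mirror the ordering described for small polygons on the SRX; they are not measurements from Figs. 2 and 3.

```python
def pipeline_rate(stage_rates):
    """Return (limiting stage, sustained rate) for a pipeline.

    A pipeline can run no faster than its slowest stage, so the
    sustained rate is the minimum over all stages.
    """
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


# Hypothetical relative rates (higher is faster), for illustration only.
srx_small_polygons = {
    "transform engine": 1.0,
    "scan converter": 4.0,
    "z-buffer": 1.5,
    "dither/transparency": 3.0,
}

print(pipeline_rate(srx_small_polygons))   # ('transform engine', 1.0)

# Speeding up only the limiting stage exposes the next bottleneck.
faster = dict(srx_small_polygons, **{"transform engine": 3.0})
print(pipeline_rate(faster))               # ('z-buffer', 1.5)
```

The second call shows why improving a single block gives limited benefit: the overall rate rises only to the speed of the next slowest stage.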

It was clear from examining the data that to improve 
performance significantly for both cases, it would be neces- 
sary to change more than one functional block. 

Transform Engine 

Each transform engine consists of a microcodable processor
and floating-point chips. (In both the SRX and the TurboSRX,
NMOS-III floating-point chips are used.) Because
of the many intricate, sophisticated algorithms necessary,
it was decided that for the TurboSRX this function should
be implemented in the same general-purpose fashion as in
the SRX. The approach taken was to use multiple higher-speed
transform engines to gain performance. Product
packaging limitations prevented a faster discrete
implementation using bit-slice hardware, so an NMOS-III VLSI
chip was designed to enable three improved transform engines
to fit into the product. It was dubbed TREIS, which
stands for TRansform Engine In Silicon. Integration provides
both reduced size and increased performance.

Fig. 2. Relative performance of 3D graphics pipeline stages
for small polygons.

Fig. 3. Relative performance of 3D graphics pipeline stages
for large polygons.

Each transform engine contains the full set of microcode,
so any transform engine can execute any graphics operation.
One transform engine acts as the master, distributing
graphics commands among the three transform engines.
Any command can therefore be distributed to the next free
transform engine, including the master.

The result is more than a threefold gain in the raw 
hardware performance in the transform engine stage of the 
pipeline for small polygons (see Fig. 2). By adding im- 
proved microcode and software and some higher-level 
functions, performance levels up to ten times that of the 
SRX can be achieved. One higher-level function, quadri- 
lateral mesh, allows the vertices of adjacent quadrilaterals 
to be transformed, clipped, and lighted a single time, result- 
ing in a net reduction of processing by almost a factor of 
four. 
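The "almost a factor of four" figure for quadrilateral meshes follows from counting vertices. The sketch below assumes a rectangular mesh of `rows` by `cols` quadrilaterals; the function names are illustrative only.

```python
def vertices_unconnected(rows, cols):
    """Vertices processed when every quadrilateral is sent separately."""
    return 4 * rows * cols


def vertices_mesh(rows, cols):
    """Vertices processed when shared vertices are handled only once."""
    return (rows + 1) * (cols + 1)


for n in (10, 100):
    ratio = vertices_unconnected(n, n) / vertices_mesh(n, n)
    print(n, round(ratio, 2))
```

For a 10 x 10 mesh the saving is about 3.3 times, and for a 100 x 100 mesh about 3.9 times. The ratio approaches but never reaches four, because the vertices on the mesh boundary are shared by fewer than four quadrilaterals.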

TREIS (see Fig. 4) is a custom NMOS-III chip containing
about 170,000 transistors, including 1536 bytes of pointer
RAM and an ALU, in a 272-pin pin-grid array (PGA) package.
It outputs a 16-bit microcode address and reads a 68-bit
wide microcode word with a highly pipelined architecture.
It improves performance over the SRX transform engine
by combining some two-state activities into one state. Like
the SRX transform engine, it connects to HP-proprietary
floating-point math chips through a 32-bit floating-point
bus for accelerated transformation, clipping, lighting, and
parametric surface calculations. The connection to the
polygon-rendering chip is through a double-buffered RAM
containing polygon and vector vertex addresses, z values,
and color data.




Fig. 4. TREIS (TRansform Engine In Silicon) chip.
Z-Buffer

Once the transform-engine bottleneck was improved, the
next performance limitation for small polygons was the
speed of the z-buffer. The SRX's z-buffer is in the
nondisplayed part of the frame buffer. (The frame buffer holds
2048 x 1024 pixels, but only 1280 x 1024 can be displayed
at a time. Most of what is not displayed can be used as a
z-buffer.) While this approach minimizes the cost of
low-end systems, maximum performance cannot be obtained
when frame buffer and z-buffer accesses cannot be done at
the same time.

When drawing with the z-buffer enabled, the SRX must 
read the z value from the frame buffer, compare the z value 
of each pixel with the z value present at that location, write 
the new z value back into the frame buffer if necessary, 
and write the pixels into the frame buffer if necessary. 
Using pixel caching allows each access to handle up to 
eight pixels (the size of a frame buffer "tile") simultane- 
ously. 

Most of the z-buffer overhead was eliminated by providing
an optional dedicated z-buffer, which allows z-buffer
RAM cycles and frame buffer RAM cycles to occur in parallel.
In this dedicated z-buffer is another custom chip, the
z-cache, which allows multiple z values to be fetched and
stored in a single RAM cycle, increases the tile size, and
performs comparisons of z values at a rate twice as fast as
the SRX.

The z-cache is an LTCMOS standard-cell design containing
about 3700 gates, packaged in a 68-pin plastic leaded
chip carrier. It is similar in design and size to the pixel
cache.² It performs fast z comparisons and allows multiple
z-buffer operations to take place in a single RAM cycle.
One chip per plane is used in the z-buffer.




Fig. 5. Pixel processor chip. 
Pixel Processor

The z-cache chip is still not enough to prevent the z-buf- 
fer from limiting overall performance, particularly for large 
polygons. 

In the SRX, whenever a new pixel needs to be written
into a tile other than the one accessed by the previous pixel,
the polygon-rendering chip is held from drawing any more
pixels until the new z tile is read. A third custom chip,
the pixel processor, was added between the polygon-rendering
chip and the z-buffer. It removes that latency by
issuing an early warning when a new tile will be needed.
This signal is provided far enough in advance of the pixel
that the z values can be fetched from the z-buffer before
the pixels are drawn. To match the output of the
polygon-rendering chip with the z-buffer better, a FIFO buffer was
added at the output of the pixel processor. This way, both
the polygon-rendering chip and the z-buffer can operate
more efficiently.

The pixel processor (see Fig. 5) is a custom NMOS-III
chip containing about 110,000 transistors in a 168-pin PGA
package. As mentioned earlier, it contains performance
improvement features such as the fast dither and transparency
operations, the FIFO control, and the early z read signal
to prevent slowing down the polygon-rendering chip. In
addition, it contains three 1024-byte gamma-correction
ROM tables for more accurate color representation, and
window clipping operations for up to 32 movable, obscurable,
overlapping, accelerated graphics windows. A
pipeline valve inside the chip allows fast window operations
without emptying the graphics pipeline. All pixel
operations inside the pixel processor are performed at the
polygon-rendering chip's pixel output speed, so the
graphics throughput does not slow down when using any
of its features.

Notice in Figs. 2 and 3 that these z-buffer enhancements
improve that portion of the pipeline for small polygon
performance by about 50% and for large polygons by a factor
of three.

Dithering and Transparency 

With z-buffer operations streamlined, there was one more
stage in the pipeline left to be improved. Dithering and
transparency in the SRX are performed with discrete TTL
logic. While this does not show up as a performance limiter
in the SRX because it is faster than the z-buffer (see Figs.
2 and 3), it would have become the limiting factor in the
TurboSRX with the fast z-buffer. Instead of leaving the
dither and transparency circuits in TTL, it was decided to
include those functions in the pixel processor. This both
improves the dither/transparency performance by a factor
of two for large polygons (Fig. 3) and improves the
reliability and cost of the overall system.

Conclusions 

Figs. 2 and 3 reveal that no stage of the TurboSRX 
pipeline is significantly slower than the others for either 
small or large polygons. Since the pipeline is fairly well 
balanced, it might appear that higher performance would 
require that all parts of the pipeline be replaced, requiring 
a large amount of product development time and cost. How- 
ever, as VLSI technology improves, so does the potential 
improvement of 3D graphics subsystems. Several areas of 
VLSI technology have been improving lately, including 
speed, density, packaging, and design productivity. Fur- 
thermore, the experience gained on earlier products has 
pointed the way toward new and better algorithms and 
architectures. Future graphics products will clearly have 
to take advantage of these latest advances to meet growing 
customer expectations. 
Global Illumination Modeling Using
Radiosity 

Radiosity is a complementary method to ray tracing for 
global illumination modeling. HP 9000 TurboSRX graphics 
workstations now offer three illumination models: radiosity, 
ray tracing, and a local illumination model. 

by David A. Burgoon 



IN COMPUTER GRAPHICS image generation systems,
an illumination model can be invoked locally or globally.
When invoked locally, only incident light from
light sources and object orientation are considered in
determining the intensity of light reflected to the observer's eye.
Invoked globally, the light that reaches an object by
reflection from or transmission through other objects in the scene
environment is also considered.

Local illumination models are popular because they
produce reasonably realistic rendering and can be computed
at interactive rates using hardware acceleration techniques.
Global models are usually used when rendering realism is
of primary importance. Traditional global illumination
modeling methods are extremely computationally intensive.
As a result, interactivity is usually sacrificed for the
sake of realism.



One of the most familiar local illumination models is
that of Phong.¹ Turner Whitted² enhanced the Phong model
for global use in ray tracing by accounting for the light
reflected or transmitted from other objects in the
environment.

In the ray tracing procedure, an intersection tree is con- 
structed by tracing a ray from the observer's eye through 
each pixel into the environment. At each surface intersected 
by the ray, two branches are added to the tree, representing 
the spawned reflected and transmitted rays. Each surface 
intersection is represented by a node in the tree. This pro- 
cess is repeated recursively. The final pixel intensity is 
determined by traversing the tree starting with the leaves 
and working toward the root, computing the intensity con- 
tribution of each node using the illumination model. The 
final pixel intensity is the sum of all of these contributions. 
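The tree traversal described above can be sketched as a short recursive function. The `Node` structure, its field names, and the reflection and transmission weights are assumptions for illustration, loosely following a Whitted-style model in which each spawned ray's contribution is scaled by a surface coefficient; the local intensity at each hit is simply given as a number here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One surface intersection in the tree (hypothetical structure)."""
    local: float                          # intensity from the local model at this hit
    k_reflect: float = 0.0                # weight applied to the reflected branch
    k_transmit: float = 0.0               # weight applied to the transmitted branch
    reflected: Optional["Node"] = None    # child spawned by the reflected ray
    transmitted: Optional["Node"] = None  # child spawned by the transmitted ray


def intensity(node):
    """Evaluate the tree from the leaves toward the root.

    A missing branch (a ray that hit nothing, or a depth cutoff)
    contributes zero.
    """
    if node is None:
        return 0.0
    return (node.local
            + node.k_reflect * intensity(node.reflected)
            + node.k_transmit * intensity(node.transmitted))


# A mirror-like surface whose reflected ray hits a second, matte surface.
root = Node(local=0.5, k_reflect=0.5, reflected=Node(local=0.25))
print(intensity(root))  # 0.625
```

The recursion visits each leaf before its parent, so the value returned for the root is the sum of every node's weighted contribution, as the text describes.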




Fig. 1. These gears were generated on the HP ME Series 30
modeling, design, and drafting system. The ray traced image
was rendered using nonuniform rational B-splines. It is a
polygonal representation with 3084 polygons and 12 partial
polygons.

Fig. 1 shows an example of a 3D image generated using
ray tracing.

Ray tracing is an important rendering method. It has
produced some of the most realistic images seen to
date. However, it is not without its deficiencies.³,⁴ In the
ray tracing method, realistic shadows are difficult to produce.
In particular, penumbras and shadow envelopes are
seldom seen in ray traced images. Most ray tracing renderers
produce sharp shadow boundaries only.

Most ray tracing systems limit themselves to modeling
only point light sources, that is, light sources assumed to
emit light that originates from a single point in space. Light
sources whose emission comes from a finite area are not
readily treated by the method. Only some of the more recent
and exotic methods, such as distributed ray tracing and
ray tracing with cones, attempt to deal with this limitation.

The reflection models used in ray tracing are usually
empirical and approximate. They are often chosen based
on subjective results rather than physical laws of energy
equilibrium. This disallows the modeling of effects such
as color bleeding, where diffuse reflection from one surface
causes a soft colored shadow to be seen on another.

Another problem with ray tracing is that it is inherently
slow. The computational expense of recursively tracing
rays for each pixel on a screen with reasonable resolution
(e.g., 1280 by 1024 pixels) can be prohibitive. Furthermore,
since the scan conversion and global illumination modeling
functions are very tightly coupled, any hardware optimized
for scan conversion that may be available is not
used. The view-dependent nature of the ray tracing algorithm
also detracts from the interactivity of the system
employing it. Each change in the viewing transformation
requires that the entire ray tracing process be repeated to
render the new view.

Perhaps the most fundamental flaw of the ray tracing 
method is that it limits itself to modeling intraenvironment 
reflections in the specular direction only. Global modeling 
of diffuse effects is ignored. 

Radiosity 

The radiosity method, introduced by Goral and others,⁵
corrects most of the above deficiencies, but at the expense
of introducing some restrictions of its own. The method
correctly models the interaction of light between reflecting
surfaces if the surfaces are restricted to be perfectly diffuse.
It replaces the constant ambient term in Phong's model
with an accurate global model. Radiosity has a fundamental
energy equilibrium basis, and is derived from methods used
in thermal engineering. Fig. 2 shows a 3D image generated
using the radiosity method.

In the radiosity method, a (possibly hypothetical) enclosure
is constructed around the environment to be rendered.
The surfaces or walls of the enclosure completely define
the illuminating environment. They consist of light sources
and reflecting walls. One or more of the surfaces of the
enclosure may be fictitious (e.g., an open window). Each of
the surfaces is assumed to be an ideal diffuse reflector, an
ideal diffuse emitter, or a combination of the two (Fig. 3).

The radiosity method deals with the equilibrium of
radiant energy within the enclosure. The light (or radiosity,
which is measured as energy/time/area) leaving a surface
i is $B_i$. It consists of direct emission $E_i$ from the surface
plus the reflected portion of light arriving at the surface.
The light arriving at i, $H_i$, is found by summing the
contributions from the other N − 1 surfaces, and from surface
i if it is concave. Note that there is no need to treat the emitted




Fig. 2. This radiosity image of a cathedral with eight bays
of windows and columns was done for two bays, and the
remaining bays were generated by a step-and-repeat process.
It took 7 minutes of preprocessing on an HP 9000 Model 350
to build the data base and subdivide the polygons (meshing),
and 12 minutes per step for 40 steps (using progressive
refinement) to generate the image (5 minutes and 8 minutes,
respectively, on an HP 9000 Model 370). There are 9916
polygons (14,316 after meshing), 26 area lights, and four
point lights.
and reflected energy separately because they are both
perfectly diffuse and therefore indistinguishable to the
observer. Unlike ray tracing, the history or direction of a ray
is lost after reflection from a surface.
The total radiosity leaving a given surface i is therefore

$$ B_i = E_i + \rho_i H_i, \tag{1} $$

where
$B_i$ = radiosity of surface i. This is the total rate at
   which radiant energy leaves the surface in
   terms of energy per unit time per unit area
   (watts per square meter),
$E_i$ = rate of direct energy emission from surface i
   per unit time per unit area,
$\rho_i$ = reflectivity of surface i. This represents the
   fraction of incident light that is reflected
   back into the hemispherical space surrounding
   surface i,
$H_i$ = incident radiant energy arriving at surface i
   per unit time per unit area (watts per square
   meter).

$H_i$ is the sum of all the light leaving the N surfaces of
the enclosure that "see" surface i. The fraction of the
radiant energy leaving a surface j that impinges on surface
i is specified by the form factor or configuration factor $F_{ji}$.
The energy per unit time arriving at surface i is therefore

$$ H_i A_i = \sum_{j=1}^{N} B_j A_j F_{ji}, \tag{2} $$



Fig. 3. Radiosity relationships. Labels in the figure: surface
i, surface j, N surfaces, emission $E_i$, total impinging energy
per unit area $\sum_j B_j F_{ij}$, total reflected energy per unit
area $\rho_i \sum_j B_j F_{ij}$, and radiosity $B_i$. (a) Copyright
© 1984 by Goral, Torrance, Greenberg, and Battaile. Used with
permission. (b) Copyright © 1986 by Greenberg. Used with
permission.



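where $A_i$ is the area of surface i. Dividing through by $A_i$
we have

$$ H_i = \sum_{j=1}^{N} \frac{B_j A_j F_{ji}}{A_i}. \tag{3} $$

According to the reciprocal nature of form factors,5

$$ A_i F_{ij} = A_j F_{ji}. \tag{4} $$

Therefore, $H_i$ is

$$ H_i = \sum_{j=1}^{N} B_j F_{ij}. \tag{5} $$

Thus the radiosity at a surface i is

$$ B_i = E_i + \rho_i \sum_{j=1}^{N} B_j F_{ij}. \tag{6} $$

This may be rewritten as

$$ B_i - \rho_i \sum_{j=1}^{N} B_j F_{ij} = E_i, \tag{7a} $$

or, for i = 1,

$$ \begin{bmatrix} 1-\rho_1 F_{11} & -\rho_1 F_{12} & \cdots & -\rho_1 F_{1N} \end{bmatrix}
\begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{bmatrix} = E_1. \tag{7b} $$

Considering all N surfaces i we have

$$ \begin{bmatrix}
1-\rho_1 F_{11} & -\rho_1 F_{12} & \cdots & -\rho_1 F_{1N} \\
-\rho_2 F_{21} & 1-\rho_2 F_{22} & \cdots & -\rho_2 F_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
-\rho_N F_{N1} & -\rho_N F_{N2} & \cdots & 1-\rho_N F_{NN}
\end{bmatrix}
\begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{bmatrix} =
\begin{bmatrix} E_1 \\ E_2 \\ \vdots \\ E_N \end{bmatrix}. \tag{7c} $$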



This system of N linear equations with N unknown values
$B_i$ has parameters $E_i$, $\rho_i$, and $F_{ij}$, which must be known
or calculated for each surface. The $E_i$ are nonzero for
surfaces that provide illumination to the enclosure. Such
surfaces could represent a diffuse area light source or panel,
or the first reflection of a directional light source from a
diffuse surface. If all of the $E_i$ are zero, then there is no
illumination and all of the $B_i$ are zero.

In general, the $E_i$ and $\rho_i$ are functions of the wavelength
of the light. They are usually chosen to represent an average
value over a bandwidth of radiation, typically red, green,
and blue. Once the form factors are calculated, the above
matrix equation is solved numerically for the B values for
each of three sets of $E_i$ and $\rho_i$ parameters.

The above equation is well-suited to solution using an
iterative Gauss-Seidel technique6 because it is diagonally
dominant, that is, the sum of the absolute values of the
nondiagonal coefficients in each row is less than the
absolute value of the main diagonal term. The solution usually
converges in six to eight iterations.
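
The Gauss-Seidel iteration on equation 7c can be sketched in a few lines. The following is a minimal illustration only, not the Starbase implementation: the function name `solve_radiosity`, the three-patch enclosure, and its emission, reflectivity, and form factor values are all assumptions chosen so that the rows of F sum to 1 and reciprocity holds for equal-area patches.

```python
def solve_radiosity(E, rho, F, iterations=8):
    """Gauss-Seidel sweeps for B_i = E_i + rho_i * sum_j F[i][j] * B_j."""
    n = len(E)
    B = list(E)  # initial guess: emitted light only
    for _ in range(iterations):
        for i in range(n):
            # Gather light from the other patches, using the newest B_j.
            gathered = sum(F[i][j] * B[j] for j in range(n) if j != i)
            # Solve row i of equation 7a for B_i.
            B[i] = (E[i] + rho[i] * gathered) / (1.0 - rho[i] * F[i][i])
    return B


# Hypothetical enclosure: three equal-area planar patches (F_ii = 0).
E = [1.0, 0.0, 0.0]      # only patch 0 emits
rho = [0.0, 0.5, 0.8]    # reflectivities
F = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

B = solve_radiosity(E, rho, F)
print(B)
```

In practice the solve would be repeated once per color band (red, green, and blue) with the same F and different E and rho.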

The aforementioned surfaces generally are not the same 
as the surfaces of the representation chosen for the geomet- 
ric model of the scene. For example, if objects are described 
using polygons, the polygons are usually subdivided into 
patches or elements (i.e., smaller polygons). These patches
become the surfaces of the enclosure. 

Once the radiosity for each primary color for each patch 
has been found, it is mapped onto the vertices of its as- 
sociated polygon so that the vertex radiosity (color) values 
can be bilinearly interpolated across the polygon using 
either Gouraud shading' or object-space interpolation. A 
good way to do this is to set the radiosities at the vertices 
of patches that are interior to a given polygon to the average 
of the adjacent patch radiosities and then extrapolate out- 
ward to the polygon vertices. 

The process of image generation using the radiosity 
method can be summarized as follows: 

■ Take the input geometry and subdivide it into patches.
■ Calculate form factors.
■ Solve for the $B_i$ for each primary color.
■ Extrapolate the $B_i$ to polygon vertices and render.

Once the form factors are calculated, they need not be
recalculated if colors (ρ) or light sources (E) change. Also,
as long as the geometry of the objects remains static,
dynamic views of the scene can be generated by merely
rerendering. This can be highly interactive on a workstation
such as the HP 9000 Model 835 TurboSRX, which has
dedicated hardware optimized for polygon rendering.

Form Factor Calculation 

We now consider the calculation of $F_{ij}$, the fraction of
the energy leaving surface i impinging on surface j (Fig. 4).
Because our surfaces are assumed to be perfectly diffuse, 
the form factor is purely geometrical in nature. It depends 
only on the shape, size, position, and orientation of the 
participating surfaces. 

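For nonoccluded environments, the form factor from one
differential area (i) to another (j) is given by

$$ F_{dA_i dA_j} = \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j, \tag{8} $$

where r is the distance between the two differential areas
and $\phi_i$ and $\phi_j$ are the angles between the line joining
them and the respective surface normals.

Fig. 4. Form factor geometry. Copyright © 1986 by Greenberg.
Used with permission.

Integrating over area $A_j$, the form factor to a finite area or
patch is

$$ F_{dA_i A_j} = \int_{A_j} \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j. \tag{9} $$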

The form factor between finite surfaces (patches) is defined
as the area average and is thus

$$ F_{ij} = F_{A_i A_j} = \frac{1}{A_i} \int_{A_i} \int_{A_j}
\frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j\, dA_i. \tag{10} $$

From the symmetry of this equation we can derive the
reciprocal relationship given in equation 4. Some other
important properties of form factors are:

■ From the law of conservation of energy, for a closed
  enclosure:

$$ \sum_{j=1}^{N} F_{ij} = 1. \tag{11} $$

■ For any surface that does not see itself (planar or convex):

$$ F_{ii} = 0. \tag{12} $$

The Hemicube Algorithm 

For occluded environments, equation 10 becomes

$$ F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j}
\frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, \mathrm{HID}\; dA_j\, dA_i, \tag{13} $$

where the Boolean function HID takes on the value 1 or 0 
depending on whether dA, can see dAj. This double area 
integral is difficult to solve analytically for general cases. 
An area integral, which is a double integral itself, can be 
transformed via Stokes' theorem into a single contour inte- 
gral, which can then be evaluated numerically, but at con- 
siderable computational expense. Numerical approxima- 
tion techniques can provide a more efficient means to com- 
pute form factors for general complex environments. The 
hemicube algorithm" employs such a numerical method 
and also addresses how to deal with the HID function. 

Inner Integral Approximation 

If the distance between the two patches i and j is large
compared to their areas, and if they are not partially
occluded from one another, the inner integral of equation 13
remains almost constant over the area $A_i$. If we let K
approximate the inner integral we have

$$ F_{ij} = F_{A_i A_j} \approx \frac{1}{A_i} \int_{A_i} K\, dA_i
= K\,\frac{A_i}{A_i} = K
= \int_{A_j} \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, dA_j. \tag{14} $$

Thus finding a solution for the inner integral K, the
differential-area-to-finite-area form factor, equation 9, will
provide a good approximation for the form factor from patch
to patch. If the patches are close together relative to their
size, or if there is partial occlusion, the patches must be
subdivided into smaller patches until equation 9 provides
a good approximation.
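
As a quick numerical check of equation 9, the sketch below integrates the differential-area-to-finite-area form factor for a case with a known closed form: a disk of radius R, parallel to and centered a height h above the differential area, for which the form factor is R²/(h² + R²). The geometry, the function name `form_factor_to_disk`, and the ring count are illustrative assumptions, not part of the original article.

```python
import math


def form_factor_to_disk(h, R, rings=2000):
    """Midpoint-rule estimate of equation 9 for a coaxial parallel disk."""
    ds = R / rings
    total = 0.0
    for k in range(rings):
        s = (k + 0.5) * ds            # radius of this ring on the disk
        r2 = h * h + s * s            # squared distance from dA_i to the ring
        cos_i = cos_j = h / math.sqrt(r2)   # both normals lie along the axis
        dA = 2.0 * math.pi * s * ds   # area of the ring
        total += cos_i * cos_j / (math.pi * r2) * dA
    return total


numeric = form_factor_to_disk(h=1.0, R=0.5)
exact = 0.5 ** 2 / (1.0 ** 2 + 0.5 ** 2)   # closed form R^2 / (h^2 + R^2)
print(numeric, exact)
```

The two values agree closely, which is the behavior the inner integral approximation relies on when patch i is small relative to its distance from patch j.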

The Nusselt Analog 

To see how to evaluate the form factor integral numeri- 
cally, Nusselt's geometric analog' 1 to the form factor integral 
is helpful (Fig. 5). 

Each differential area patch has its own view of the en- 
vironment, which is the hemisphere of directions sur- 
rounding its normal. For a finite area, the form factor is 
equivalent to the fraction of the circle (which is the base 
of the hemisphere of directions) covered by projecting the 
area onto the hemisphere and then orthographically down 
onto the circle. 

The easiest way to see that this analogy correctly
describes the form factor integral is to think of it in terms of
solid angles and projected areas. The area of plane A that
is seen by or projected onto plane B is the area of A times
the cosine of the angle between the normals of the two
planes. It is equal to the area of the shadow that A would
cast onto B. The solid angle can be thought of as a
generalization of the planar angles with which we are familiar.
Recall that a planar angle θ, measured in radians, is defined
to be equal to the length of the arc subtended by the angle
divided by the radius r of the circle containing the arc. Since
the total circumference of a circle is 2πr, there are 2π radians
in a circle. Similarly, a solid angle ω, measured in
steradians, is defined to be the area subtended by the solid angle
divided by the square of the radius of the sphere containing
the area. Since the total area of a sphere is 4πr², there are
4π steradians in a sphere (and 2π steradians in a
hemisphere). Stated another way, one steradian subtends a unit
area of a unit sphere.

Now, returning to our inner integral approximation,
equation 9, the solid angle dω that subtends the
infinitesimal area $dA_j$ is (see Fig. 6):

$$ d\omega = \frac{S_{dA_j}}{r^2}, \tag{15} $$

where $S_{dA_j}$ is the portion of the area of the sphere with
radius r that is projected by $dA_j$ in the direction of r. Since
$dA_j$ is infinitesimally small, this projected area is planar,
and is given by $\cos\phi_j\, dA_j$. Thus, dω is

$$ d\omega = \frac{\cos\phi_j\, dA_j}{r^2}. \tag{16} $$

Now, returning to the unit hemisphere of the Nusselt
analog, the area on the unit hemisphere projected by $dA_j$ is

$$ (\text{solid angle})(\text{radius}^2) = (d\omega)(1^2) = d\omega
= \frac{\cos\phi_j\, dA_j}{r^2}. \tag{17} $$

The projection of this area onto the base of the unit
hemisphere is

$$ (\cos\phi_i)(d\omega) = \frac{\cos\phi_i \cos\phi_j\, dA_j}{r^2}. \tag{18} $$

Taking the ratio of this area to the total area of the base of
the unit hemisphere (π) we have, as before, the differential
form factor

$$ F_{dA_i dA_j} = \frac{\cos\phi_i \cos\phi_j\, dA_j}{\pi r^2}. \tag{19} $$

Integrating these differential form factors over $A_j$ and
then taking the area average of this integral gives us the
double area integral expressed in equation 10 for the form
factor $F_{ij}$. Using the inner integral as an approximation is
equivalent to using the center point of patch i to represent
the average position of patch i, constructing a unit
hemisphere around this point, and summing the differential
form factors.

Fig. 5. The Nusselt analog. The form factor is equal to the
fraction of the base of the hemisphere covered by the
projection. Copyright © 1985 by Cohen and Greenberg. Used with
permission.

Fig. 6. The area $dA_j$ is taken to be the area of patch j that
is visible through the solid angle dω. Adapted from Wallace.
Used with permission.

Delta Form Factors 

To approximate the inner integral, the hemisphere of
directions can be divided into discrete solid angles Δω,
and a delta form factor can then be calculated:

$$ \Delta F = \frac{\cos\phi_i \cos\phi_j}{\pi r^2}\, \Delta A_j. \tag{20} $$

The form factor $F_{ij}$ can then be approximated by summing
the delta form factors covered when projecting patch j onto
the unit hemisphere surrounding the center of patch i. If
all the patches in the environment are projected onto the
hemisphere, discarding the projections of the more distant
patches in the case of two or more patches with overlapping
projections, the sums of the delta form factors covered by
these projections give the form factors from all patches to
the patch represented at the center of the hemisphere. This
procedure intrinsically includes the effects of hidden
surfaces.

To make this procedure practical, a convenient means
of dividing the surface of the hemisphere into discrete areas
(subtended by the discrete solid angles Δω) is needed. The
delta form factors for each of these discrete areas could be
precalculated and stored in a lookup table. An evaluation
can then be made as to which patch projects onto a given
discrete area. For a given patch j, the form factor calculation
problem is reduced to determining through which of the
discrete solid angles Δω surface j is visible. Unfortunately,
for a hemisphere, it is difficult to devise a method of
creating equal discrete areas and a set of linear coordinates to
describe the locations of these areas uniquely.




Fig. 7. Areas with identical form factors. Areas A, B, C, D,
and E all have the same form factor. Copyright © 1985 by
Cohen and Greenberg. Used with permission.



The Hemicube 

It would be handy if we could choose a more convenient 
surface than a hemisphere to project the patches onto. From 
the Nusselt analog it can be seen that any two patches in 
the environment that project onto the same set of discrete 
areas of the hemisphere will have the same form factor 
value. Said another way, any two areas that are seen
through the same set of delta solid angles will have the
same form factor. In Fig. 7, E is the set of discrete areas
and A, B, C, and D all have the same form factor. Consider
area D. If we allow D to be part of the top part of a cube 
surrounding the patch i of interest, we can determine the 
form factor from patch i to the patch with area A by calculat- 
ing the form factor to the patch on the cube with area D 
from patch i. Thus, instead of projecting directly onto the 
unit hemisphere, we can first project onto a "hemicube" 
and then calculate the form factor of the intermediate patch 
that has area equal to the projected area of the original 
patch. 

More specifically, an imaginary cube is constructed
around the center of the patch i of interest (Fig. 8). The
environment is then transformed to set patch i's center at
the origin (eye) with the patch's normal coincident with
the positive Z axis (assuming a left-handed coordinate
system). The cube is sized so that the perpendicular distance
from the center of the patch to the surface of the cube is
1. In this orientation, the aforementioned unit hemisphere
is surrounded by the upper-half surfaces of the cube, the
lower half being below the horizon of the patch. One full
face, facing in the +Z direction, and four half faces, facing
in the ±X and ±Y directions, replace the hemisphere.
These faces are divided into square discrete areas (pixels)
at some resolution, usually between 50×50 and 100×100,
and the environment is then projected onto the five planar
faces.

The beauty of this scheme is that the mathematics and 
algorithms involved in these projections are well-known: 
the same clipping, projection, and hidden surface removal 
techniques used for projection of an environment onto a 
raster display screen can be used here. (Hardware
optimized for these operations can also be employed.) The
view direction is set equal to each of the +Z, +X, −X,
+Y, and −Y axes, and every other patch in the
environment is projected onto each of the five "screens," which
are the faces of the hemicube perpendicular to each of these
five directions. Each full face of the cube covers a 90°
frustum as viewed from the center of the cube. This creates
clipping planes of Z = X, Z = −X, Z = Y, and Z = −Y
that can be used in a simple Sutherland-Hodgman clipper' 1 
streamlined to handle 90° frustums. Each projected patch 
can then be scan converted or rasterized to determine 
which patch's projection covers a given pixel. If two 
patches project onto the same hemicube pixel, a Z-buffer 
algorithm can be used to decide which patch is seen in the 
discrete solid angle represented by the pixel. However, 
unlike the conventional Z-buffer algorithm used for image 
rendering, intensity data is not stored for each pixel. In- 
stead, the frame buffer is used as an item buffer to store 
an integer identifying the patch that is seen by the pixel 
represented by the item buffer address. 

After determining which patch j projects onto each
hemicube pixel, a summation of delta form factors for each
pixel covered by patch j determines the form factor from
patch i, at the center of the hemicube, to patch j. That is,

$$ F_{ij} = \sum_{q=1}^{R} \Delta F_q, \tag{21} $$

where $\Delta F_q$ is the delta form factor associated with hemicube
pixel q, and R is the number of hemicube pixels covered
by projection of patch j onto the hemicube surrounding
element i.

Fig. 8. (a) The hemicube. An imaginary cube is created
around the center of a patch. Every other patch in the
environment is projected onto the cube. (b) Projection of the
environment onto the hemicube. Copyright © 1985 by Cohen and
Greenberg. Used with permission.

This summation is performed for each patch in the envi- 
ronment to form a complete row of N form factors. Then 
the hemicube is positioned around another patch i and the 
process is repeated. 

The delta form factor values represented by each
hemicube pixel are easily calculated from the delta form factor
equation (20) and can be stored in a lookup table. Because
of symmetry, this table need only contain values for one
eighth of the top face and one half of a side face of the
hemicube (Fig. 9).
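
A delta form factor table of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions: it applies equation 20 at pixel centers of a unit hemicube, where a top-face pixel at (x, y, 1) gives ΔF = ΔA / (π(x² + y² + 1)²) and a side-face pixel at (1, y, z) gives ΔF = z ΔA / (π(y² + z² + 1)²). The function name and the choice to store full faces (rather than exploiting the symmetry noted above) are assumptions made for readability.

```python
import math


def hemicube_delta_form_factors(res=100):
    """Return (top, side) delta form factor tables for a unit hemicube.

    top[a][b]  covers the +Z face (z = 1, x and y in [-1, 1]).
    side[a][b] covers one side half face (x = 1, y in [-1, 1], z in [0, 1]).
    """
    d = 2.0 / res            # pixel width
    dA = d * d               # pixel area
    centers = [-1.0 + (k + 0.5) * d for k in range(res)]
    heights = [(k + 0.5) * d for k in range(res // 2)]
    # Top face: cos(phi_i) = cos(phi_j) = 1/r, with r^2 = x^2 + y^2 + 1.
    top = [[dA / (math.pi * (x * x + y * y + 1.0) ** 2) for y in centers]
           for x in centers]
    # Side face: cos(phi_i) = z/r, cos(phi_j) = 1/r, r^2 = y^2 + z^2 + 1.
    side = [[z * dA / (math.pi * (y * y + z * z + 1.0) ** 2) for z in heights]
            for y in centers]
    return top, side


top, side = hemicube_delta_form_factors(100)
# One full top face plus four identical side half faces cover the hemisphere.
total = sum(map(sum, top)) + 4.0 * sum(map(sum, side))
print(total)
```

Because the five faces together subtend the whole hemisphere of directions, the delta form factors should sum to approximately 1, which gives a convenient sanity check on any table of this kind.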

In summary, the hemicube algorithm provides two main 
contributions. It provides a very practical method of nu- 
merically approximating the form factor integral, and pro- 
vides a method of properly accounting for the effects of hid- 
den and occluded surfaces at minimal additional expense. 

Substructuring 

The hemicube algorithm, as presented above, has some 
problems. Areas in a scene with high intensity (radiosity) 
gradients (shadow boundaries and penumbra, for example) 
may be poorly represented, particularly when the patches 
are large relative to the area over which the radiosity gra- 
dient occurs. To remedy this, the areas of surfaces with 
high radiosity gradients must be subdivided into finer and 
finer grids of patches. This presents two problems: how to 
increase the number of patches without incurring signifi- 
cant additional computational cost, and how to decide 
which areas of the scene should be subdivided. These prob- 
lems were addressed in a paper by Cohen and others. 12 

The solution of the radiosity simultaneous equilibrium
equations using the Gauss-Seidel technique is O(N²), that
is, the number of calculations required is of order N², where
N is the number of patches used to describe the scene. The
calculation of the form factors is also O(N²). If the first
problem is not addressed and N is naively increased, the
computational costs can be prohibitive.

To remedy this situation we borrow a concept from en- 
gineering mechanics known as substructuring, where the 
solution for local stress behavior is based on the global 
structure response to a coarse solution. Applying this no- 
tion to the radiosity problem, we subdivide the patches 
that are too large into a total of M elements (according to 
criteria to be discussed later), leaving K unsubdivided 
patches. It is assumed that each element has a constant
radiosity, but that these element radiosities vary across the
patch. Next, we would like to be able to find the radiosities
of the elements using a solution for the radiosities of the
original patches and avoiding a full O((M + K)²) solution,
somehow applying the solution for the global patch
radiosities to the elements.

Element Radiosities 

To see how to do this, assume that patch i has been
subdivided into R elements. We can then represent $B_i$, the
radiosity of patch i, as the average over the area of the
patch of the element radiosities:

$$ B_i = \frac{1}{A_i} \sum_{q=1}^{R} B_q A_q. \tag{22} $$

From the definition of the radiosity of an element given
in equation 6 we know that

$$ B_q = E_q + \rho_q \sum_{j=1}^{N} B_j F_{qj}. \tag{23} $$

Substituting equation 23 into equation 22, we have

$$ B_i = \frac{1}{A_i} \sum_{q=1}^{R}
\left( E_q + \rho_q \sum_{j=1}^{N} B_j F_{qj} \right) A_q. \tag{24a} $$

Distributing, we have

$$ B_i = \frac{1}{A_i} \sum_{q=1}^{R} E_q A_q
+ \frac{1}{A_i} \sum_{q=1}^{R}
\left( \rho_q \sum_{j=1}^{N} B_j F_{qj} A_q \right). \tag{24b} $$

If we assume the emission and reflectivity of the patch
are constant, then $E_i = E_q$ and $\rho_i = \rho_q$. Also, if the global
radiosities $B_j$ are assumed constant for each element over
the surface of each patch, we have

$$ B_i = E_i + \rho_i \sum_{j=1}^{N} B_j
\left( \frac{1}{A_i} \sum_{q=1}^{R} F_{qj} A_q \right). \tag{24d} $$

Fig. 9. Derivation of delta form factors. Copyright © 1985 by
Cohen and Greenberg. Used with permission.

By comparing equation 6, the quantity in parentheses
above in equation 24d is easily seen to be the patch-to-patch
form factor expressed as the area-weighted average of the
element-to-patch form factors, where the elements are
subdivisions of patch i. Thus

$$ F_{ij} = \frac{1}{A_i} \sum_{q=1}^{R} F_{qj} A_q. \tag{25} $$

Each of the element-to-patch form factors $F_{qj}$ can be found
using the hemicube algorithm. Then the patch-to-patch
form factors $F_{ij}$ are calculated using equation 25. The
standard system of simultaneous radiosity equations (7c) can
then be solved to yield the patch radiosities in O(N²) time.
The resulting patch radiosities are more accurate than those
that would have been obtained without subdividing the
patches into elements. This is because the expression for
$F_{ij}$ given by equation 25 represents a discrete numerical
method for approximating the outer area integral of the
form factor double area integral given in equation 13. Recall
that in the original hemicube algorithm the outer integral
was taken to be unity because we restricted the patches to
be small relative to the distances that separate them. We
now can remove that restriction by using equation 25, but
placing the same restriction on the elements that make up
a patch.

Once the patch radiosities have been calculated, the
radiosity for each element q can be found using the basic
equation for the radiosity of an element, equation 23.

In short, subdivision of patches into elements and use 
of the above equations provides two main advantages over 
naive use of a finer patch resolution. First, the local vari- 
ations of intensity within a patch can be accurately approx- 
imated without having to solve the global radiosity equa- 
tions on an element level. Second, the radiosity solution 
on the patch level is more accurate because it better approx- 
imates the patch-to-patch form factors. 

Substructuring Algorithm 

The following are the major steps involved in employing 
these optimizations in a rendering algorithm: 

1. Form a hierarchical description of the environment con- 
sisting of surfaces, subsurfaces, patches, and elements. 

2. For each element, find the form factor to each patch 
using the hemicube algorithm, and store the results in 
an M x N matrix (M = number of elements, N = number 
of patches). 

3. Compress this matrix into an N x N patch form factor 
matrix using equation 25. 

4. For each of the primary color bands — red, green, and 
blue — form and solve the set of N equations in N un- 
knowns for the patch radiosities using the Gauss-Seidel 
iterative technique and equation 7c. 

5. Compute the M element radiosities for each element q
using equation 23, the patch radiosities, and the
element-to-patch form factors computed in step 2.

6. Calculate element vertex radiosities from the radiosities 
of the elements adjacent to the vertex. 

7. Linearly interpolate the vertex radiosities across the ele- 
ments using Gouraud shading or object-space interpola- 
tion. 
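
Steps 3 through 5 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the element-to-patch matrix `F_elem` is simply given (in practice it would come from the hemicube algorithm in step 2), and the function names, the two-patch scene, and all numeric values are hypothetical. Patch 0 is left unsubdivided (one element); patch 1 is split into two elements.

```python
def patch_form_factors(F_elem, elem_area, elem_patch, n_patches):
    """Step 3 (equation 25): area-weighted average of element rows."""
    patch_area = [0.0] * n_patches
    F = [[0.0] * n_patches for _ in range(n_patches)]
    for q, row in enumerate(F_elem):
        i = elem_patch[q]                 # parent patch of element q
        patch_area[i] += elem_area[q]
        for j in range(n_patches):
            F[i][j] += row[j] * elem_area[q]
    return [[v / patch_area[i] for v in F[i]] for i in range(n_patches)]


def gauss_seidel(E, rho, F, sweeps=20):
    """Step 4: solve equation 7c for the patch radiosities."""
    n = len(E)
    B = list(E)
    for _ in range(sweeps):
        for i in range(n):
            others = sum(F[i][j] * B[j] for j in range(n) if j != i)
            B[i] = (E[i] + rho[i] * others) / (1.0 - rho[i] * F[i][i])
    return B


def element_radiosities(F_elem, elem_patch, E, rho, B):
    """Step 5 (equation 23), using the parent patch's E and rho."""
    return [E[elem_patch[q]]
            + rho[elem_patch[q]] * sum(f * b for f, b in zip(row, B))
            for q, row in enumerate(F_elem)]


# Hypothetical scene: M = 3 elements, N = 2 patches.
elem_patch = [0, 1, 1]
elem_area = [2.0, 1.0, 1.0]
F_elem = [[0.0, 0.6],    # element 0 (all of patch 0)
          [0.8, 0.0],    # element 1 (half of patch 1, nearer the light)
          [0.4, 0.0]]    # element 2 (half of patch 1, farther away)
E = [1.0, 0.0]
rho = [0.5, 0.5]

F = patch_form_factors(F_elem, elem_area, elem_patch, 2)
B = gauss_seidel(E, rho, F)
B_elem = element_radiosities(F_elem, elem_patch, E, rho, B)
print(B, B_elem)
```

Only the N x N patch system is solved, yet the two elements of patch 1 receive different radiosities, and their area-weighted average reproduces the patch radiosity, consistent with equation 22.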

Adaptive Subdivision 

We now address the criteria to be used in deciding how
to partition the scene hierarchically down to the element
level. Ideally, the element mesh should be densest in re- 
gions of high intensity gradients. Cohen's paper 12 says that 
a reasonable first guess must be provided by the user as to 
which areas are likely to have high intensity gradients. 
These areas include areas in shadow and areas near light 
sources. Then the intensities of adjacent vertices found in 
step 6 can be compared. If the change in intensity is greater 
than some threshold value, the elements adjacent to that 
vertex should be recursively subdivided until the intensity 
change is below the threshold. The algorithm is then recur- 
sively repeated, beginning at step 2. 

Cohen used a simple binary subdivision, where each 
rectangular element is divided into four new elements. 
This preserves the original patch's geometry and allows 
most of the previously computed element-to-patch form 
factors to be reused. The only change that needs to be made 
to the original MxN element-to-patch form factor matrix 
to subdivide a particular element i is to remove row i from 
the matrix and insert four new element rows. (Of course, 
the hemicube algorithm must be used to calculate the ele- 
ments of the new rows). This object-space subdivision 
technique is analogous to the Warnock algorithm,13 which
subdivides polygons to perform hidden surface removal.



Progressive Refinement 

Perhaps the most significant improvement to the radios- 
ity method is the algorithm based on the technique of pro- 
gressive refinement devised by Cohen and his colleagues. 14 
This algorithm has two main advantages over those we 
have described so far. 

First, it provides renderings of the environment that are
early approximations of the final energy-equilibrium
solution. This has the advantage of allowing the user to see
advance previews approximating the final correctly
rendered scene without having to wait for the full O(N²)
solution to equation 7c. At each step of the progressive
refinement approach, the rendering of the scene gracefully
converges to the full solution. The user can interactively stop
this progression when the rendering looks good enough.
In most cases, a useful image is produced in O(N) time.

The second advantage of the progressive refinement
approach is a reduction in storage and start-up computational
costs. The previous algorithms require that all form factors
be precalculated before the Gauss-Seidel solution begins.
This requires O(N²) storage. For reasonably complex
environments, this cost can be significant. For example, an
environment of 50,000 patches will require a gigabyte of
storage. In the progressive refinement algorithm, form
factors are calculated on the fly to reduce the form factor
storage requirements to O(N) and eliminate the associated
start-up computational costs.

The progressive refinement algorithm can be thought of 
as a restructuring of previous methods, and differs from 
them primarily in two ways. First, the radiosity of all 
patches is updated simultaneously instead of one at a time 
during each iteration. Second, patches are processed in 
sorted order according to their energy contribution to the 
environment. 

To gain an insight into how this is possible, consider 
row i of equation 7c (i.e., equation 6). This equation may
be thought of as one that determines the light leaving patch 
i by gathering in the light from the rest of the environment 
(Fig. 10). 

A single term from the summation in equation 6
determines the contribution of patch j to the radiosity of patch
i, that is,

$$ \text{Contribution of } B_j \text{ to } \Delta B_i = \rho_i B_j F_{ij}. \tag{26} $$

The progressive refinement method reverses this process
by considering the contribution made by patch i to the
radiosity of all other patches. The reciprocity relationship
(equation 4) provides the basis for this reversal. The
contribution of the radiosity from patch i to the radiosity of
patch j is

$$ \text{Contribution of } B_i \text{ to } \Delta B_j
= \rho_j B_i F_{ij} \frac{A_i}{A_j}. \tag{27} $$

The total contribution to the environment from the 
radiosity of patch i is determined by calculating the above 
equation for all patches j. 

A key fact about this reformulation is that the radiosities
of the patches j in the environment are updated using form
factors calculated via a single hemicube placed at patch i.
Thus, each step of the iteration no longer requires that all
of the form factors $F_{ij}$ be known in advance. Each step of
the solution now consists of placing a single hemicube
around a patch i and adding the contribution from the
radiosity of that patch to the radiosities of all other patches,
calculating form factors as needed. In effect, we are
shooting light from patch i out into the environment rather than
gathering the light from the environment received at patch
i (Fig. 10). For a more detailed description of this iterative
shooting algorithm, consult the literature.14

To arrive at the final solution as quickly as possible, we
capitalize on the fact that if the patches i with the largest
contribution to the environment are processed first, the
final value for the radiosity of patch j, which is the sum
of these contributions, will be approached earlier. Stated
intuitively, those patches radiating the most light energy
should be treated first, since they have the greatest effect
on the illumination of the environment. This energy will tend
to come from those patches having the largest product $B_i A_i$.

Accordingly, the progressive refinement algorithm is
implemented by always shooting first from patches for which
the difference $\Delta B_i A_i$ between the previous and current
estimates of unshot radiosity is greatest. This usually results
in most light sources being processed first, followed by the
patches that receive the most light from the light sources,
and so on. Thus, when solving in sorted order, the solution
tends to proceed in nearly the same order as light would
propagate through the environment. Solving in sorted order
usually yields a useful estimate of the final solution in less
than a single full iteration, substantially reducing
computation costs.14 Fig. 2 was rendered using a progressive
refinement technique.
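
The shooting iteration can be sketched as follows. This is a simplified illustration, not the published algorithm in full: the form factor matrix F is supplied up front for brevity, whereas in the method described above row i would be computed on the fly from a single hemicube at patch i. The function name and the three-patch enclosure are assumptions.

```python
def progressive_radiosity(E, rho, A, F, steps):
    """Shoot unshot radiosity from the brightest patch, 'steps' times."""
    n = len(E)
    B = list(E)     # current radiosity estimate for every patch
    dB = list(E)    # unshot radiosity for every patch
    for _ in range(steps):
        # Pick the patch with the most unshot energy (largest dB_i * A_i).
        i = max(range(n), key=lambda k: dB[k] * A[k])
        shoot, dB[i] = dB[i], 0.0
        for j in range(n):
            # Equation 27: only row i of F is needed for this step.
            gain = rho[j] * shoot * F[i][j] * A[i] / A[j]
            B[j] += gain
            dB[j] += gain
    return B


# Hypothetical enclosure: three equal-area planar patches (F_ii = 0).
E = [1.0, 0.0, 0.0]
rho = [0.0, 0.5, 0.8]
A = [1.0, 1.0, 1.0]
F = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

preview = progressive_radiosity(E, rho, A, F, steps=1)   # after one shot
final = progressive_radiosity(E, rho, A, F, steps=40)
print(preview, final)
```

After a single shot from the light source, every patch already has a usable first estimate; later shots add the interreflections and converge toward the same values a full Gauss-Seidel solution would give.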

Summary and Conclusions 

The radiosity global illumination method combats many
of the deficiencies of ray tracing. Radiosity methods
produce excellent penumbras, shadow envelopes, and color
bleeding effects. Area light sources are accurately modeled.
The radiosity model is "correct" in the sense that it is based
on laws of physics (energy equilibrium because of
conservation of energy). In the radiosity method, illumination
modeling is decoupled from scan conversion and
rendering. Finally, radiosity algorithms are view independent.
This allows a high degree of interactivity for static geometry
once the preprocessing is complete.

Despite these advantages, radiosity also has some
disadvantages with respect to ray tracing. Rendering using the
full radiosity solution is slow (although proponents claim
it is faster than ray tracing because it is view independent).
Also, specular reflections, transparency, and translucency
are not modeled.

Radiosity and ray tracing are complementary methods.
No one method models reality perfectly (although radiosity
advocates point out that most natural environments are
predominantly diffuse). Recent research involves
combining aspects of both methods. For example, in a very recent
paper by Wallace and his colleagues,17 a ray tracing
technique is used to compute form factors, instead of the
hemicube algorithm. Also, techniques have been recently
proposed for producing specular highlights along with
global diffuse illumination components. ,0,,51B 

There is still a fair amount of research that needs to be
done before an interactive global model can be offered that
models an environment perfectly without having to
sacrifice diffuse components, as in ray tracing, or specular
highlights, as in radiosity. However, the illumination
models that have been developed to date are extremely useful
and should be made available to users of graphics
workstations. Accordingly, Hewlett-Packard chose to become the
first workstation vendor to offer radiosity-based
illumination modeling as well as the more traditional methods. In
July 1989, HP released its Starbase Radiosity and Ray Tracing
software, which integrates into the Starbase display list
support for both radiosity and ray tracing on high-end
TurboSRX workstations. This gives the graphics programmer
a choice of three illumination methods: local illumination
based on an enhanced Phong model, global illumination
based on anti-aliased ray tracing, and global illumination
using progressive refinement radiosity. Applications using
the Starbase display list can now be written to provide the
user with the widest possible variety of photorealistic
rendering.

Fig. 10. Gathering versus shooting. Gathering light through a
hemicube allows one patch radiosity to be updated. Shooting
light through a single hemicube allows the entire
environment's radiosity values to be updated simultaneously.
Copyright © 1989 by Cohen, Chen, Wallace, and Greenberg.
Used with permission.

We have presented a tutorial summary of the theory and
algorithms of the radiosity method that have appeared in
the literature over the last few years. We have done so with
the hope that the reader will gain an intuitive feel for the
method, some of the improvements that have been made
to it, and the advantages that may be gained from it. No
attempt has been made to discuss the particulars of HP's
implementation, which makes use of the most recent
advances. 14,17